Snobol4



A Computer Programming Language
for the Humanities 



Robert Gaskins, Jr.



Laura Gould



University of California
Spring, 1972



Copyright 1972 by Robert Gaskins, Jr., and Laura Gould

All Rights Reserved



Nothing amuses more harmlessly than 
computation, and nothing is oftener 
applicable to real business or 
speculative inquiries. A thousand
stories which the ignorant tell, and
believe, die away at once when the
computist takes them in his grip.

Samuel Johnson,
Letter to Sophia Thrale
(at Bath), July 24, 1783



CONTENTS 

[Note: the starred sections are not yet available 4/1/72]

Preface

1A. Computer Programming in Snobol
    Devising a Program
    Writing a Snobol Program Text
    Input and Output
    Execution of a Snobol Program

*1B. Computer Applications Using Snobol

2A. Assignment
    Literal Values
    Variables
    Assignment Rules
    The Null Value
    The Special Variable OUTPUT
    The Special Variable INPUT
    Other Forms of Input and Output
    Procedures
    The TRIM() Procedure
    The SIZE() Procedure
    Operators
    The Concatenation Operator
    The Arithmetic Operators
    A Complete Snobol Program Text

*2B. Examples and Applications

3A. The Flow of Control
    Labels
    Go-to's
    The Special Transfer END
    Failure of the Rule
    Failure of INPUT
    Evaluation Rules
    Test Procedures
    The Test Procedures IDENT() and DIFFER()
    The Test Procedure LGT()
    Arithmetic Test Procedures
    Test Procedures within Assignment Rules
    Loops
    Loops Controlled by Data Conditions
    Loops Controlled by Counts

*3B. Examples and Applications

4A. Pattern Matching
    The Pattern Matching Rule
    The Replacement Rule
    The Alternation Operator
    The Pattern Procedures ANY() and NOTANY()
    The Conditional Assignment Operator
    Concatenation of Patterns
    The Immediate Assignment Operator
    The Pattern Procedures SPAN() and BREAK()
    The Pattern Procedure LEN()
    The ANCHOR() Procedure
    The Pattern Procedures TAB() and RTAB()
    The Pattern Procedures POS() and RPOS()
    The Pattern Procedure ARBNO()
    Assigning Patterns to Variables
    The Deferred Evaluation Operator
    The Special Pattern Variables ARB and REM
    A Program to Illustrate Pattern-Matching

*4B. Examples and Applications

5A. Indirect Referencing
    The Indirect Referencing Operator
    The Operand of the Indirect Referencing Operator
    A Program to Produce a Character Count
    Concatenation within the Operand
    A Program to Produce a Frequency Table
    A Program to Produce a Word Count
    Indirect Referencing within the Go-to

*5B. Examples and Applications

6A. Programmer-defined Procedures
    Defining a Procedure
    The DEFINE() Procedure
    Procedure Bodies
    The Returns RETURN, NRETURN, and FRETURN
    Procedure Calls
    The Passing of Arguments
    Additional Internal Variables
    References to External Variables
    Side-effects of Procedures
    Levels of Internal Variables
    The Use of NRETURN to Return a Variable
    The APPLY() Procedure
    Using a Library of Procedures

*6B. Examples and Applications

7A. Arrays
    Creating an Array
    Array Items and Item References
    Comparison with Indirect Referencing
    Multi-dimensional Arrays
    The ARRAY() Procedure
    Selectors
    Failure of an Item Reference
    Special Problems Concerning Item References
    The ITEM() Procedure
    The PROTOTYPE() Procedure
    The TYPE() Procedure
    Procedure to Return a Selector
    Procedure to Interchange Two Arrays
    The Name Operator
    Forming all Selectors of an Array
    Procedure to Return the "Next" Selector
    Procedure to Return a Copy of any Array

*7B. Examples and Applications

*8A. Programmer-defined Data Structures

*8B. Examples and Applications

Appendixes

A. Summary of Predefined Procedures
    I. Program Procedures
        A. Test Procedures
        B. Result Procedures
        C. Data Procedures
    II. System Procedures
        A. Declarations
        B. Access to System Information
        C. Requests for System Actions
        D. Input/Output Procedures

B. Summary of Predefined Pattern Variables
    ARB and REM
    BAL
    FAIL
    ABORT
    FENCE

C. Summary of Operators

D. Summary of Procedure Execution

*E. The Pattern-Matching Algorithm

*F. Summary of Snobol Arithmetic

*G. Summary of Input/Output Procedures

H. Program Text Representation
    Statement Format
    Continuation Cards
    Comment Cards
    Listing Control Cards
    Extended Syntax of Snobol Statements

I. Character Set Representations

J. Syntax of Program Texts

K. Summary of Compile-time Error Messages

L. Summary of Execution-time Error Messages

M. Non-standard Features of Berkeley Snobol
    I. Features which are Handled Differently
        Procedures
        Operators
        Keywords
        Datatypes
        System Transfers
        Output
        Program Representation
        The Program Listing
    II. Features Absent from the Berkeley Version
        Procedures
        Operators
        Keywords
        Pattern Variables
        Datatypes
        Pattern Matching
        Arithmetic
        Output
    III. Features not Present in the Bell Version
        Procedures

Index



PREFACE 



Edmund Fuller has described hearing an interview in
which Edward R. Murrow asked Mickey Spillane how he could
bring himself to pander to the public taste by writing the
kind of books he did. Spillane's reply, according
to Fuller, was: "I write the kind of books I want to read
and can't find."

We, with much the same motivation, have written this
description of Snobol4, a computer programming language for
the humanities. Our own training and interest is in the
study of language and literature, and so the examples and
exercises are directed particularly toward the machine
manipulation of linguistic data and literary texts. Even so,
the description should be useful to students of many
disciplines, since the first part of each chapter presents
features of the language in a generalized way, and the
particular examples in the second part of each chapter have
been chosen to exhibit principles and techniques which can
easily be applied to verbal or symbolic data in a wide range
of humanistic and social science applications.

This presentation of Snobol4 is particularly designed
for members of the University of California community who
have no previous knowledge of computers or computer
programming. It describes a dialect of the language for
Control Data Corporation 6000 series machines, implemented
at the Berkeley Computer Center by Paul McJones and Charles
Simonyi; Mr. McJones has reviewed our work as it has
progressed, and has made many helpful suggestions.

It is intended that this manual will be expanded to 
provide a complete description of the Snobol4 language and
of various related facilities available at the Berkeley
Computer Center which are of interest to Snobol users. We 
would naturally be pleased to receive suggestions for 
improvements and additions from readers. We hope that few 
mistakes remain, even in this preliminary version, but each 
of us blames the other for any that may be found. 



1A. COMPUTER PROGRAMMING IN SNOBOL

Snobol is a programming language, one of many such
artificial languages which may be used to convey
instructions to a computer. Most computers may be instructed
in a wide variety of programming languages; these languages
differ from one another, as do natural languages, by having
different vocabularies and syntactic structures. More
importantly, however, they differ in the range of concepts
which they are capable of expressing.

Different programming languages have been developed for
different kinds of problems or problem areas. Some have been
devised primarily for describing general numeric or
algebraic problems, others for describing the structure of
business records and files, still others for highly specific
purposes such as controlling machine tools, simulating
economic systems, or making computer-generated movies.
Snobol is distinguished by very powerful and general
capabilities for manipulating strings of characters, making
it particularly convenient for working with data from areas
such as linguistics, literature, verbal behavior, and the
humanities in general. It is also very useful for expressing
sophisticated non-numeric problems in the field of computer
science.

Devising a Program. A description of how a computer is
to go about solving a problem consists of a list of tasks or
actions to be performed. A specification in some programming
language which describes such a series of tasks completely
is called a "program text." Before a program text can be
written, the task which it is to describe must be clearly
understood. If, for example, a task has been expressed in
English as "find all vowels in a word," the following
questions must be resolved before the programming of the
task in some programming language can be undertaken:

(1) what is a vowel?

(2) what is a word?

(3) what should be done with the vowels which are
found?

The answers might be as follows:

(1) one of the characters A, E, I, O, or U

(2) a string of characters to be provided as data to
the program

(3) count them and then print the total






Given these clarifications, one can then translate the
unrigorous English sentence "find all vowels in a word" into
a rigorous step-by-step description of what must be done;
this step-by-step description can then be translated again
into a series of statements in an appropriate programming
language. The intermediate translation may exist only in the
mind of the programmer, as is often the case if the task is
a simple one, or may be recorded in some fashion so that it
may be considered for correctness.

One of the best ways of recording a step-by-step
description is to write down a series of numbered statements
specifying exactly what is to be done. These statements are
still in English, but a much more detailed and careful
English than that of the original problem. The statements
differ from the sentences of a natural language paragraph in
that they are not intended to be processed only once or in
the order in which they are presented; hence, the statements
are numbered so that the order in which they are to be
processed, often repeatedly, may be specified. A set of
numbered statements describing how to count all the vowels
in a series of words and to print the counts might look as
follows:

START 

(1) Get the next word; if no more words, STOP. 

(2) Print that word. 

(3) Set the tally to zero. 

(4) Get the next character of this word; if no more
characters remain, go to (7); otherwise go to the next
statement.

(5) Determine whether or not this character is an
A, E, I, O, or U; if it is not, go back to (4); otherwise go to
the next statement.

(6) Add one to the tally which is keeping track of the
number of vowels in this word; go back to (4).

(7) Print the value of the tally, which now represents 
the total number of vowels in the word. Go back to (1) and 
attempt to get another word. 

Note that this program description has been augmented 
to count the vowels in any number of words, one after 
another, and to print the counts separately. It would not be 
useful to write a program to count the vowels in a single 
word only, as the counting could be accomplished by hand 
much faster than the program could be written. (However, for 
more complicated tasks, a program can often be written much 
more easily than the task can be performed even once by
hand; that such a program could then be used again might
well be of secondary importance.) 
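As a rough illustration only, the numbered statements above can be mirrored in a present-day language. The sketch below is in Python, not Snobol; the names count_vowels and report are hypothetical, and the printed lines are collected in a list rather than sent to a printer.

```python
VOWELS = "AEIOU"  # answer (1): the characters treated as vowels


def count_vowels(word: str) -> int:
    """Steps (3) to (6): tally the vowels in one word."""
    tally = 0                        # step (3): set the tally to zero
    for character in word:           # step (4): next character, if any remain
        if character in VOWELS:      # step (5): is it A, E, I, O, or U?
            tally += 1               # step (6): add one to the tally
    return tally


def report(words):
    """Steps (1), (2) and (7): return the lines that would be printed."""
    lines = []
    for word in words:               # step (1): stop when no words remain
        lines.append(word)           # step (2): print the word
        lines.append(str(count_vowels(word)))  # step (7): print the tally
    return lines
```

Calling report on a list of words yields each word followed by its count, one item per printed line.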






Another method of recording a step-by-step description 
is to use what is called a "flow chart." In a flow chart the 
specification of what is to be done next, or the "flow of 
control," is indicated by means of lines and arrows rather 
than by phrases of the form "go back to (1)." A flow chart
equivalent to the numbered statements just provided might 
look as follows: 



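              START
                |
  +------------>|
  |             v
  |  (1) +-------------+
  |      |  get next   |--Fail--> STOP
  |      |    word     |
  |      +-------------+
  |             | Succeed
  |             v
  |  (2) +-------------+
  |      |  print the  |
  |      |    word     |
  |      +-------------+
  |             |
  |             v
  |  (3) +-------------+
  |      |  set tally  |
  |      |   to zero   |
  |      +-------------+
  |             |
  |             |<-------------------------+--------------------------+
  |             v                          | Fail                     |
  |  (4) +-------------+        (5) +-------------+        (6) +-------------+
  |      |  get next   |--Succeed-->|  test for   |--Succeed-->|   add one   |
  |      |  character  |            |  A,E,I,O,U  |            |  to tally   |
  |      +-------------+            +-------------+            +-------------+
  |             | Fail
  |             v
  |  (7) +-------------+
  |      |    print    |
  |      |  value of   |
  |      |    tally    |
  |      +-------------+
  |             |
  +-------------+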






Writing a Snobol Program Text. Now that a detailed
method for solving the problem is clearly understood, it may 
be translated into a set of statements in the Snobol 
language. Seven Snobol statements are provided below, one 
for each of the numbered English sentences, or, 
equivalently, one for each box of the flow chart. These 
statements are provided here to illustrate the close 
correspondence between the Snobol statements and the step- 
by-step description, to give some indication of the 
appearance of a programming language, and to point out some 
features of the Snobol language in particular; a complete 
discussion of the meaning of these statements must be 
deferred to later chapters of the text. (Comments, beginning 
with asterisks, have been inserted for spacing and to 
explain the purpose of the statements.) 

* STEP 1: READ IN THE NEXT WORD - IF NO MORE WORDS, STOP
*
READ      WORD = TRIM(INPUT)                  : F(END)
*
* STEP 2: PRINT THE WORD JUST READ IN
*
          OUTPUT = WORD
*
* STEP 3: SET THE TALLY TO ZERO
*
          TALLY = 0
*
* STEP 4: GET THE NEXT CHARACTER OF THIS WORD - IF NO MORE
*         CHARACTERS, PRINT THE VOWEL COUNT FOR THIS WORD
*
GETCHAR   WORD LEN(1) . CHAR = NULL           : F(PRINT)
*
* STEP 5: SEE IF THIS CHARACTER IS A VOWEL - IF NOT,
*         GO BACK AND GET NEXT CHARACTER
*
          CHAR ANY('AEIOU')                   : F(GETCHAR)
*
* STEP 6: CHARACTER IS A VOWEL - ADD ONE TO THE TALLY
*
          TALLY = TALLY + 1                   : (GETCHAR)
*
* STEP 7: PRINT NUMBER OF VOWELS AND RETURN TO
*         READ IN THE NEXT WORD
*
PRINT     OUTPUT = TALLY                      : (READ)
END






Each Snobol statement consists of three basic parts,
any of which may be absent. These parts are called the
label, the rule, and the go-to. The label is the first part
and serves to identify the statement (as did the numbers in
the English description above); the rule is the middle part
and specifies some action to be performed; the go-to is the
last part and indicates which statement is to be considered
next by providing its label in parentheses. (The F within
the first three go-to's above indicates that the go-to is to
be taken only if the action specified by the rule preceding
it fails; otherwise control is sent to the next statement of
the series.)
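The division of a statement into label, rule, and go-to can be imitated in a modern language. The following Python sketch is an analogy only and is not how a Snobol system is built: each statement is a hypothetical tuple (label, rule, go-to on failure, go-to on success), a rule is a function that returns True when it succeeds, and a go-to of None means that control passes to the next statement. All names in it are illustrative assumptions.

```python
def run(statements):
    """Execute (label, rule, on_failure, on_success) entries in order.

    Each rule is a function returning True (succeed) or False (fail).
    A go-to of None means "continue with the next statement".
    """
    labels = {entry[0]: index for index, entry in enumerate(statements) if entry[0]}
    position = 0
    while position < len(statements):
        _label, rule, on_failure, on_success = statements[position]
        target = on_success if rule() else on_failure
        if target == "END":
            return
        position = labels[target] if target else position + 1


def vowel_program(words):
    """Hypothetical driver: the seven statements written as Python closures."""
    state = {"input": list(words), "output": [], "word": "", "char": "", "tally": 0}

    def read():                      # READ: fails when no words remain
        if not state["input"]:
            return False
        state["word"] = state["input"].pop(0)
        return True

    def print_word():
        state["output"].append(state["word"])
        return True

    def zero_tally():
        state["tally"] = 0
        return True

    def get_char():                  # GETCHAR: fails when the word is used up
        if not state["word"]:
            return False
        state["char"], state["word"] = state["word"][0], state["word"][1:]
        return True

    def is_vowel():                  # fails when the character is not a vowel
        return state["char"] in "AEIOU"

    def add_one():
        state["tally"] += 1
        return True

    def print_tally():               # PRINT
        state["output"].append(str(state["tally"]))
        return True

    run([
        ("READ", read, "END", None),
        (None, print_word, None, None),
        (None, zero_tally, None, None),
        ("GETCHAR", get_char, "PRINT", None),
        (None, is_vowel, "GETCHAR", None),
        (None, add_one, "GETCHAR", "GETCHAR"),
        ("PRINT", print_tally, "READ", "READ"),
    ])
    return state["output"]
```

An unconditional go-to is modelled here by giving the same label for both success and failure.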

Input and Output. Before the statements of a program
text can be used to instruct a computer, they must first be
put in what is called "machine-readable form." For instance,
they must be punched on cards to be read into the computer's
memory via a card reader, or typed in on a teletype
connected to the computer. The data to be manipulated, such
as the words whose vowels are to be counted, are seldom
explicitly provided within a program text, but are prepared
separately and must also be put in machine-readable form
before they can be accessed.

The Snobol language provides facilities for reading in 
units of data, called "records," and for writing out the 
results of manipulating this data. These are called "input"
and "output" facilities. The first statement of the program 
text above indicates that some input is needed; in 
particular, it specifies that an indefinite number of words, 
one at a time, are to be read from a "file" of data which 
must be supplied with the program. The second statement 
specifies that some output is to be produced; in particular, 
that the word just read in is to be printed at the beginning 
of a new line of printer paper. The last statement specifies 
that the number of vowels found within that word is to be 
printed on the following line. 

If the file of data to be used as input for the program 
text above were the following list of words 

HIPPOPOTAMUS
HIPPOS
HIPPOSIDEROS
HIPPOSPONGIA
HIPPOTIGRINE
HIPPOTOMY
HIPPOTRAGINE
HIPPOTRAGUS






then the output produced by the program would be the list 

HIPPOPOTAMUS
5
HIPPOS
2
HIPPOSIDEROS
5
HIPPOSPONGIA
5
HIPPOTIGRINE
5
HIPPOTOMY
3
HIPPOTRAGINE
5
HIPPOTRAGUS
4

Results from executing a program may be printed on 
paper for personal perusal, written on magnetic storage 
media, or punched on cards. Since the last two are machine- 
readable as well as machine-writeable, the output may be 
used again, without modification, as input data to be 
further processed by still another program. 

Execution of a Snobol Program. It is not enough for a
computer to have available to it both a program text and
some data in machine-readable form; it must also have
available to it a "translator" or "system" to process the
language in which the program text has been written. A
computer may have available any number of language
processors and hence may be able to "understand" any number
of languages. A processor itself consists of a program,
written in some programming language (often in a language
that is basic and unique to a particular computer, but
possibly in Snobol). The data which such a system will use
is a program text in the language for which it is the
processor.





If a statement is well-formed, it is converted by the 
compiler into a representation ("Code") suitable for later 
processing by the interpreter; if it is not well-formed, it 
is flagged as being syntactically incorrect. All statements 
of the program text are processed, even if incorrect ones 
occur, so that all syntactic errors are found. The 
programmer can locate the incorrect statements by inspecting 

submit his program text as data for the compiler to process.

If no compile-time errors occur, the message SUCCESSFUL
COMPILATION is written at the end of the program listing.
The interpreter then starts processing, using the converted
statements of the program text as its data; the entire set
of converted statements representing a program text is
called a "program." The interpreter executes the program,
causing the computer to perform whatever task has been
described. It starts by executing the first statement of the
program and then proceeds to process the converted
statements in the order specified by the go-to's, reading
input from a data file and producing output whenever
requested. Execution continues until the task is finished
(as signified here by the END statement) or until an
execution-time error (such as a request to multiply 'CAT' by
'CATALOG') occurs. If this happens, the programmer can
inspect the error message printed by the interpreter and can
attempt to determine his mistake. He can then modify the
program text and submit it once again to the joint process
of compilation and execution.






2A. ASSIGNMENT

A Snobol program text consists of a sequence of
statements in the Snobol language. These statements are
compiled to produce a series of instructions to the
computer, causing it to store data in its memory, to perform 
operations on this data, and to preserve the results for 
human inspection and/or for further processing by machine. 
The data to be manipulated is usually stored externally to 
the program and is read in by the program as it is needed. A 
few data values, however, are often written directly in the 
program text itself. These values may be of several 
different types, but are most often simply strings of 
characters. 

Literal Values. Strings are sequences of characters
which may be of any length and may be composed of any
characters in the computer's character set (see Appendix I).
Strings whose characters are written directly in the program
text are called string literals and are designated by being
delimited by either single or double quotes; a string
consisting of the five English vowels may be written in a
Snobol program text as either

'AEIOU' or "AEIOU"

with exactly the same effect. This permits a string literal
to contain whichever quote mark is not being used as the
delimiter without confusion. For example,

"LADY□CHATTERLEY'S□LOVER"

is a string of 23 characters, while

'"AY!"□HE□SAID□BRIEFLY.'

is a string of 22 characters. Notice that spaces
(represented here by the symbol □) are treated like any
other characters in string literals.
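Python happens to follow the same quoting convention, so the two character counts can be checked with a short sketch. This is Python rather than Snobol, ordinary spaces are used in place of the visible space symbol, and the variable names are hypothetical.

```python
# Either quote mark may delimit a literal; the other may then appear inside it.
title = "LADY CHATTERLEY'S LOVER"      # double quotes outside, apostrophe inside
remark = '"AY!" HE SAID BRIEFLY.'      # single quotes outside, double quotes inside

# The same five vowels written with each kind of delimiter.
vowels_single = 'AEIOU'
vowels_double = "AEIOU"

# Spaces and quote marks inside a literal count as characters like any others.
title_length = len(title)
remark_length = len(remark)
```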

Strings consisting of nothing but digits with perhaps 
an initial plus sign or minus sign are called numeric 
strings and are of datatype Integer; all other strings are 
of datatype String. Those strings which are of datatype 
Integer, and which do not have an initial sign, may be 
represented in the program text with or without surrounding 
quotes. If quotes are not used, as in

669 7449 23 






then these numeric strings are called integer literals. When
an integer literal is stored in the memory, any leading
zeroes it may have had are removed; that is, the integer is
stored in a "canonical" form. (The canonical form of zero is
the single character 0.) Thus 00023 and 23 and '23' all have
identical representations in the memory. Leading zeroes may
be preserved for non-numeric applications by representing
integers in the program text as string literals containing
leading zeroes. For example, '00023' would be stored as a
five-character string, while '23' would be stored as a two-
character string. String literals are always stored within
the computer's memory exactly as they are represented in the
program, while integer literals are always stored in
canonical form. In what follows, the term string will be
used to include objects of datatype Integer as well as
objects of datatype String.
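The difference between the two kinds of literal can be sketched as follows. This is a Python illustration of the rule just stated, not a description of how the Berkeley system stores values; the function names are hypothetical.

```python
def store_integer_literal(text: str) -> str:
    """Return the canonical form of an unquoted, unsigned run of digits."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("an integer literal consists of digits only")
    # Remove leading zeroes; the canonical form of zero is the single character 0.
    return text.lstrip("0") or "0"


def store_string_literal(text: str) -> str:
    """A string literal is stored exactly as written between its quotes."""
    return text
```

With these definitions the integer literal 00023 is stored as the two characters 23, while the string literal '00023' keeps all five characters.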

Variables. Once a value of any datatype is stored
within the computer's memory, some method must be provided
for referring to it so that it may be used repeatedly
throughout the program. Each value is stored by being
assigned to a variable, which serves as a reference, or
pointer, to the value. Every variable has a name, and any
non-null string of characters may be used as the name of a
variable. That is, the name of a variable may be of any
length and may be composed of any characters of the
character set. Those names which begin with a letter and
consist of an arbitrarily long sequence of letters, digits,
and periods are said to be in "identifier form" and may be
written directly in the program text. Thus

RHYME1 VOWELS UNSUCCESSFUL. COGNATES P.V.C

are all valid representations of variables in program texts
since they are all identifiers, while

1RHYME ..VOWELS TEST/3 P-V-C

are not, since the first two don't begin with a letter, and
the last two contain impermissible characters.
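The rule for identifier form (a letter followed by any number of letters, digits, and periods) can be expressed as a pattern. The sketch below uses Python's re module as an illustration; the name is_identifier is hypothetical, and accepting lower-case letters is an assumption made here, not something the text specifies.

```python
import re

# A letter, then any number of letters, digits, or periods.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9.]*")


def is_identifier(name: str) -> bool:
    """True if the whole name is in identifier form."""
    return IDENTIFIER.fullmatch(name) is not None
```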

String literals, integer literals, and variables thus
have representations in a program text which allow them to
be easily differentiated from one another; string literals
begin with a quote (and must end with a quote as well),
integer literals begin with a digit, and names of variables
begin with a letter. (Other ways of representing variables,
and particularly variables whose names are not in the form
of identifiers, are discussed in Chapter 5 and Chapter 7.)






Assignment Rules. The most fundamental kind of rule in
the Snobol language is the assignment rule which is used to
assign a value to a variable. The variable is usually
represented by an identifier and the value can be a String
or an Integer or may be of any other datatype (Real,
Pattern, Array, etc.). For example, the assignment rule

VOWELS = 'AEIOU'

specifies that the five-character string AEIOU is to be
stored in the memory as the value of the variable named
VOWELS. Similarly

COUNT = 47

specifies that the integer 47 is to be stored as the value
of the variable named COUNT.

In general, an assignment rule has the meaning: let the 
variable represented on the left side of the equals sign
refer to the value specified on the right side of the equals 
sign. (It is obvious that the equals sign does not have its 
usual arithmetic meaning in an assignment rule; it is being 
used as an "assignment sign.") 

An assignment rule may have a variable name on its 
right side, rather than a literal. When a variable occurs on 
the right, it is used to refer to its value. Thus the 
sequence of rules 

ALEPH = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
ALPHA1 = ALEPH
LETTERS = ALEPH

specifies that the variable ALEPH is to have as its value
the 26-character string of the alphabet, that the variable
ALPHA1 is to have as its value the current value of ALEPH,
and so forth. In an assignment rule, when the name of a
variable occurs on the left of the assignment sign it stands
for the variable; when the name of a variable occurs on the
right, it stands for the value of that variable.

The relation between a variable and its value need not
be a permanent one. Usually a variable is assigned a variety
of different values in the course of executing a single
program (hence the term "variable"). A variable named WORD,
for example, might be assigned as its successive values each
new word encountered in a group of data, thus changing its
value 10,000 times for a text 10,000 words in length. Each
time a value is assigned to a variable, the previous value
of the variable is lost; thus the value of a variable is
always the one most recently assigned.
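One way to picture assignment is as a table from variable names to values, in which a name on the left selects the entry to change and a name on the right is looked up for its current value. The Python sketch below is such a picture only; the dictionary memory and the functions assign and value_of are hypothetical.

```python
memory = {}  # maps the name of each variable to its current value


def assign(name: str, value: str) -> None:
    """Left side of a rule: the name stands for the variable itself."""
    memory[name] = value          # any previous value is lost


def value_of(name: str) -> str:
    """Right side of a rule: the name stands for the variable's value."""
    return memory[name]


assign("ALEPH", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
assign("ALPHA1", value_of("ALEPH"))     # ALPHA1 = ALEPH
assign("LETTERS", value_of("ALEPH"))    # LETTERS = ALEPH

assign("WORD", "HIPPOS")
assign("WORD", "HIPPOTOMY")             # only the latest value is kept
```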

The Null Value. All variables, before they have been
assigned any other value, start out with the "empty" or null
value. After a variable has been assigned a non-null value,
it may be given the null value again by executing an
assignment rule with a null value on the right side, such as

VOWELS = 

The null value may also be represented by an "empty"
literal, one with no characters in it, as in

VOWELS = ''

or

VOWELS = ""

or by a variable which has a null value, such as

VOWELS = NULL

or

VOWELS = ANYTHING

if the variables NULL and ANYTHING have null values when the
rules are executed. (In all examples which follow, wherever
the variable NULL occurs it is assumed by convention to have
a null value.)

The null value is a special entity in Snobol, distinct 
from all other values, and has a variety of important uses 
in the language. Notice particularly that it is 
distinguished from the strings space and zero. Thus 

VOWELS = '□'

VOWELS = '0'

and

VOWELS = 0

are each assignments which give the variable named VOWELS a
non-null value; the first value is of datatype String, while
the last two are of datatype Integer. Although the null
value is a distinct value, it is not given a special
datatype; by convention the null value is of datatype
Integer. Thus the general term string, which includes
objects of datatype String as well as of datatype Integer,
includes also the null value unless specified otherwise.
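The datatype conventions described in this chapter can be gathered into one small function. The Python sketch below is an illustration of those conventions and not the Berkeley implementation; the name datatype is hypothetical, and treating a lone sign with no digits as a String is an assumption of this sketch.

```python
def datatype(value: str) -> str:
    """Classify a stored value as 'Integer' or 'String'."""
    if value == "":
        return "Integer"              # by convention, the null value
    digits = value[1:] if value[0] in "+-" else value
    # Nothing but digits, with perhaps an initial plus or minus sign.
    # (Assumption: a sign with no digits after it is treated as a String.)
    if digits and digits.isascii() and digits.isdigit():
        return "Integer"
    return "String"
```

A single space is therefore a String, while both the character 0 and the null value are of datatype Integer.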






The Special Variable OUTPUT. Once values have been
stored within the computer's memory, they may be printed out
by assigning them to the special variable OUTPUT. This
variable differs from others in having the following special
property: whenever the variable OUTPUT is assigned a string
as its value, that value is transmitted to a file to be
printed on a line printer which is attached to the computer.
Each execution of a rule in which OUTPUT is assigned such a
value results in the printing of a new line of information
(a record). For example, execution of either

OUTPUT = 'AEIOU'

or

OUTPUT = VOWELS

(if the current value of the variable VOWELS is the string
AEIOU) would cause the five letters AEIOU to be printed at
the left margin of the next available line of the output
paper.

If OUTPUT is assigned a null value, as in

OUTPUT =

or

OUTPUT = NULL

the result is a null record, which appears as a blank line
on the output paper.

OUTPUT may be assigned a string of any length as its
value, but only the first 132 characters, the number of
characters available per line on a printer, will be printed.
The entire string, however, remains the value of OUTPUT and
may thus be assigned as the value of other variables as
well. The variable OUTPUT, like any other variable, may be
used on either side of an assignment rule, as in the
sequence

OUTPUT = VOWELS
OUTPUT = OUTPUT
COPY = OUTPUT

whose execution would result in the two lines of output

AEIOU
AEIOU

Note that although the special variable OUTPUT is
involved in all three rules, no printing is produced by the
third because it does not specify that OUTPUT is to be
assigned a value; rather, the value of OUTPUT, which at the
time the rule is executed is the string AEIOU, is assigned
to the variable COPY.
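The behavior of OUTPUT resembles that of a variable whose assignment has a side effect. The Python sketch below imitates it with a property; it is an analogy only, the class name Program and the list printed are hypothetical, and lines are collected in a list instead of being sent to a printer.

```python
LINE_WIDTH = 132  # characters available per printed line


class Program:
    """Holds a stand-in for the special variable OUTPUT."""

    def __init__(self):
        self.printed = []      # the lines that would appear on paper
        self._output = ""

    @property
    def OUTPUT(self):
        # Using the value of OUTPUT prints nothing.
        return self._output

    @OUTPUT.setter
    def OUTPUT(self, value):
        # Assigning to OUTPUT keeps the whole string as its value
        # and prints at most the first 132 characters.
        self._output = value
        self.printed.append(value[:LINE_WIDTH])


program = Program()
VOWELS = "AEIOU"
program.OUTPUT = VOWELS            # prints AEIOU
program.OUTPUT = program.OUTPUT    # prints AEIOU again
COPY = program.OUTPUT              # prints nothing
```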

The Special Variable INPUT. Data may be read into the
computer's memory by the use of the special variable INPUT,
which differs from other variables in that it has the
following property: whenever the value of the variable INPUT
is needed for the execution of a statement, INPUT acquires
for its value the next record of the input file. For
example, in the assignment rule

LINE = INPUT

the value of INPUT is needed, so it can be assigned as the
value of LINE; LINE receives as its value the string of
characters in the next input record.

It is important to recognize that the value of INPUT 
cannot be saved or used without assigning it to another 
variable in the same rule in which it is read. The next use 
of INPUT will refer, not to its present value, but to the 
next record of the data. Thus the sequence 

LINE1 = INPUT 
LINE2 = INPUT 

assigns two successive records to the two variables LINE1
and LINE2.

This example illustrates an important difference 
between the variables INPUT and OUTPUT: INPUT displays its 
special property (to acquire the next record of an input 
file as value) every time its value is needed, but not when 
it is assigned a value; OUTPUT displays its special property 
(to write a record on an output file) every time it is
assigned a value, but not when its value is needed. Thus the 
last value assigned to OUTPUT is always available for 
assignment to another variable. 

The special variables INPUT and OUTPUT may both be used 
in a single rule, as in 

OUTPUT = INPUT 

Execution of this rule will cause the characters of the next 
data record to be printed by the line printer. Repeated 
execution of such a rule could be used to make a printed
listing of an entire group of data (as will be shown in
Chapter 3).






The value of INPUT is always 80 characters long, a
convention adopted since that is the width of a card and of
lines sent from many remote terminals. If the record being
read actually has more than 80 characters, the excess is
ignored; if it has fewer than 80 characters, spaces are
added at the end to fill out the full length. Executing the
rule

VOWELS = INPUT

where the next data record has the five vowel characters
starting in the first position, causes the variable VOWELS
to be assigned a string consisting of the 5 characters AEIOU
followed by 75 spaces.
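
The fixed record length can be pictured with a short Python sketch. It is an illustration only: the function name as_input_value is invented here, and the sketch models nothing but the cutting and padding rule just described.

```python
RECORD_LENGTH = 80  # width of a card, as stated in the text


def as_input_value(record):
    """Return the record cut or padded with spaces to exactly 80 characters."""
    return record[:RECORD_LENGTH].ljust(RECORD_LENGTH)


vowels = as_input_value("AEIOU")          # short record: padded with spaces
long_value = as_input_value("X" * 100)    # long record: excess ignored
```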

Other Forms of Input and Output. The input to a Snobol
program may exist in the form of punched cards or it may be
stored on a disk file or on magnetic tape. The output from a
program may be printed on paper, punched on cards, or
written on a disk file or on magnetic tape. Snobol provides
the special variable INPUT for reading cards and the special 
variable OUTPUT for producing printed paper, but provides no 
other special variables for dealing with the other input and 
output devices listed above. If the programmer wishes to use 
these other media, he must cause a variable to be associated 
with a file for input or output, and then use that variable 
much as INPUT and OUTPUT are used within his program. 
Methods of associating program variables with input and 
output files are described in Appendix A, section II.D.

Procedures. The small amount of Snobol so far presented
allows one to enter data into the computer's memory (either
by writing it directly in the program text in the form of 
string and integer literals or by using the special variable 
INPUT) and then to print it out (using the special variable 
OUTPUT). However, it is seldom the case that the output is 
to be the same as the input; that is, some manipulation of 
the data is usually necessary before the desired results can 
be obtained. One way of manipulating the data is to invoke 
what is termed a procedure. Many procedures to perform 
common tasks are already predefined in the Snobol language; 
a summary of all the predefined procedures which are 
available may be found in Appendix A. Besides using these
predefined procedures, programmers may define their own 
procedures and add them to the language within their own 
programs (see Chapter 6). 

A procedure is invoked, or called, by writing a 
procedure reference consisting of the name of the procedure 
followed directly by its argument list enclosed within
parentheses. This means that the Snobol system is to perform
the action of the procedure, using its one or more arguments
as data, and is to return the result of carrying out the
action as the value of the procedure call.



The TRIM() Procedure. The use of the special variable
INPUT almost always results in strings which have spaces at
the end. Since these spaces are often not wanted, a
TRIM() procedure is provided by Snobol which accepts any
expression whose value is a string as its single argument;
the procedure returns as its value the same string but with
all trailing spaces removed. Thus those 75 unwanted spaces
which occur in the value of VOWELS when the rule 

VOWELS = INPUT 

is executed may be trimmed off by using the rule 

VOWELS = TRIM(INPUT)

instead. This would give VOWELS the five-character value
AEIOU.

When the rule 

VOWELS = TRIM(INPUT)

is executed, the eighty-character value of INPUT (the next
record) is obtained, the trailing spaces are removed from it
by the TRIM() procedure, and the shortened string is
returned as the value to be assigned to the variable VOWELS.

Although the TRIM() procedure is most often used to
trim the value of INPUT, it may be used to return the
trimmed value of any string given as its argument. For
example, in the rule 

TEXT1 = TRIM(TEXT2) 

the call to the TRIM() procedure returns the trimmed version
of the string which is the value of TEXT2, to be assigned to
the variable TEXT1. The value of TEXT2 remains unchanged;
that is, it still contains any trailing spaces it had when 
the rule was executed. To trim TEXT2 one could use the rule 

TEXT2 = TRIM(TEXT2) 

Note that although variables and procedures may have 
the same names, there is no confusion in their use in 
program texts, since procedure names are always followed
immediately by an open parenthesis preceding the argument
list. Thus one may write

TRIM = TRIM(TEXT)

to assign to the variable TRIM the trimmed value of TEXT. 

The SIZE() Procedure. The length of any string may be
determined by a SIZE() procedure, which accepts any
expression whose value is a string as its argument; the
procedure returns as its value an integer which is the
number of characters in that string. That is, executing

LENGTH1 = SIZE(VOWELS)

would assign to LENGTH1 the integer value 5, while executing

LENGTH2 = SIZE(INPUT)

would assign to LENGTH2 the integer value 80. When the
argument of SIZE() is a null value, the result is the
integer value zero.

The length of the trimmed value of INPUT may be
determined by using the procedures TRIM() and SIZE()
together. This may be done by using the two procedures in
two different assignment rules, such as

SAVE = TRIM(INPUT)
LENGTH = SIZE(SAVE)

or, if the value of INPUT were not to be saved but only its
length, by combining both procedures in a single assignment
rule, such as

LENGTH = SIZE(TRIM(INPUT))

Here the argument of a procedure reference is still another
procedure reference; clearly, these nested procedure calls
must be processed from the inside out, since the argument of
SIZE() is not known until TRIM() has returned the result of
its work. As this example shows, an argument of a procedure
reference may be any expression which produces a value the
procedure is able to accept.
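
For readers who know a general-purpose language, the inside-out evaluation of nested calls may be sketched in Python. The functions trim and size below are stand-ins written for this illustration, not the Snobol procedures themselves, and a fixed 80-character string takes the place of a record read through INPUT.

```python
def trim(s):
    """Analogue of TRIM(): remove trailing spaces only."""
    return s.rstrip(" ")


def size(s):
    """Analogue of SIZE(): number of characters in the string."""
    return len(s)


record = "AEIOU" + " " * 75      # an 80-character value such as INPUT supplies
save = trim(record)              # like  SAVE = TRIM(INPUT)
length = size(save)              # like  LENGTH = SIZE(SAVE)
nested = size(trim(record))      # like  LENGTH = SIZE(TRIM(INPUT)); inner call first
```

Unlike the Snobol rules, the sketch uses the same record twice, since a Python string does not read a new record each time its value is needed.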

Operators. Data may also be manipulated by means of a
number of different operators provided within the Snobol
language. Each operator specifies that some sort of
operation is to be performed on its operand(s). Operators
having a single operand are termed unary operators;
operators having two operands are termed binary operators.
Often the same symbol is used in program texts to indicate
both a unary operator and a binary operator with different,
perhaps completely unrelated, meanings. The meanings are
easily differentiated, however, since a unary operator must
always directly precede its operand with no intervening
blank; a binary operator must always be bounded by blanks. A
summary of all the operators available in Snobol may be
found in Appendix C.

The Concatenation Operator. One of the most frequently
used operators is the concatenation operator. When the
operands of this binary operator are strings, it specifies
that the two strings are to be concatenated together, i.e.,
that the second string is to be appended directly to the
first. The symbol for this binary operator, since it occurs
so often, is simply a single blank (which requires,
therefore, no further blanks to separate it from its
operands). For example, the assignment rule

ALPHA = VOWELS CONSONANTS 'YW'

contains two concatenation operators and specifies that the
variable ALPHA is to be assigned a string built up by taking
the value of VOWELS, followed by the value of CONSONANTS,
followed by the two characters YW. If the variables VOWELS
and CONSONANTS have previously been assigned the expected
values, then the variable ALPHA will be assigned the value
of all the characters of the alphabet, in the indicated
order. The values of VOWELS and CONSONANTS are in no way
changed by the execution of this rule; likewise, subsequent
changes in their values can in no way affect the value of
ALPHA, which will change only when another rule specifying
an assignment to ALPHA is executed.

The variable appearing to the left of the assignment 
sign may be used within a concatenation on the right as 
well, as in the rule 

VOWELS = VOWELS 'YW'

This rule appends the characters YW to the string which is
the current value of VOWELS and then assigns this resulting
string as the new value of the variable VOWELS. The old
value of VOWELS is thereby lost.

Rules of this form are often used to collect successive 
characters in an increasingly long string. Execution of the 
rule

LIST = LIST NEWCHAR

would cause whatever new character is the value of NEWCHAR
to be appended to those already referred to by the variable
LIST, and this longer string to be re-assigned to the
variable LIST. If LIST had a null value, as it easily might
the first time the rule was executed, then it would simply
be assigned the same value as that of NEWCHAR; the
concatenation would indeed take place as specified but there
would be no evidence that it had occurred since the null
value contributes no characters to the string.

Note that no spaces are generated by the concatenation
process itself. That is, the new characters are appended to 
the list in the example above in a contiguous fashion with 
no intervening spaces. If spaces are desired in the result 
of a concatenation, they must themselves be concatenated 
into the string, as in the sequence 

OUTPUT = 'A ROSE'
OUTPUT = OUTPUT ' IS ' OUTPUT ' IS ' OUTPUT

whose execution will produce the following output:

A ROSE
A ROSE IS A ROSE IS A ROSE

More complicated Snobol expressions may be operands of
the concatenation operator; for example, the TRIM()
procedure may be used to produce a heading, as in

OUTPUT = '****** ' TRIM(INPUT) ' ******'

or

HEAD = TRIM(INPUT) ' ' TRIM(INPUT) ' ' TRIM(INPUT)

This last rule specifies that the next three data records 
are to be read, their trailing spaces (if any) trimmed off, 
and a single space placed between the trimmed content of 
successive records. The resulting string is then assigned to 
the variable HEAD by which it may be referenced in other 
statements of the program. 

If an integer literal is involved in a concatenation,
it contributes the string of digits representing its numeric
value. Thus

SUBST = VOWELS ' ' 6

and

SUBST = VOWELS ' 6'

produce the same string as the new value of SUBST, namely
AEIOU 6.

The Arithmetic Operators. Four binary operators are
provided within Snobol for doing the four basic arithmetic
operations of addition, subtraction, multiplication, and
division. The symbols used to represent these operators in
the program text are as follows:

addition          +
subtraction       -
multiplication    *
division          /

Since these are binary operators, they must always be
bounded by blanks.

The assignment rules 

ANSWER = 669 + 527 

ANSWER = ((A * B) - (C * (-D))) / E 

ANSWER = (SUM1 / SUM2) + 3

would all assign an integer value to the variable ANSWER,
provided the variables to the right of the assignment signs
all refer to values of datatype Integer when the rules are
executed.

Repeated executions of rules of the form

COUNT = COUNT + 1

are often used to count the number of times a given event
occurs. These rules are in some ways analogous to ones of 
the form 

LIST = LIST NEWCHAR 

which cause a new character to be appended to the value of 
LIST; here a new integer, one larger than its predecessor, 
becomes the value of COUNT. If COUNT had a null value when
the rule was executed, it would acquire the value 1 since 
the null value is considered equal to zero when it is an 
operand of an arithmetic operator. 

The operands of arithmetic operators must always be
numeric; that is, they must be any expressions whose values
are integers, real numbers (numbers containing decimal
points), or null. Real numbers and integers, however, may
not occur together within the same arithmetic expression
(i.e., mixed mode arithmetic is not allowed). Further
information on Snobol arithmetic, including facts about real
numbers, conversion of integers into real numbers and real
numbers into strings, truncation on division, etc., may be
found in Appendix ♦?.

A Complete Snobol Program Text. Given below is a
complete program text which makes use of only a few of the
features of the Snobol language already described: it
employs only assignment, concatenation, and the special
variable OUTPUT; since all data is provided within the
program text, the special variable INPUT is not needed.
Comments have been inserted in the program text before some
statements to indicate their purpose; a comment is
distinguished by having an asterisk (*) as its first
character. Instructions for representing program texts on
punched cards may be found in Appendix H.

* PROGRAM TO PRINT A PARTICULAR DESIGN INVOLVING FISH
* SET UP THE BASIC COMPONENTS
         LT = '<'
         GT = '>'
         BL4 = '    '
         BL10 = BL4 BL4 '  '
*
* BUILD FISH WHICH SWIM LEFT, SWIM RIGHT, AND MATE
         LFISH = LT GT LT
         RFISH = GT LT GT
         MFISH = LFISH GT
*
* BUILD LONGER STRINGS COMPOSED OF DIFFERENT KINDS OF FISH
         LSWIM = LFISH BL4 LFISH BL4 LFISH BL4 LFISH BL4
         RSWIM = RFISH BL4 RFISH BL4 RFISH BL4 RFISH BL4
         MSWIM = MFISH BL10 MFISH BL10 MFISH BL10 MFISH
         SCHOOL = RSWIM LSWIM
*
* PRODUCE FOUR LINES OF OUTPUT
         OUTPUT = RSWIM RSWIM
         OUTPUT = LSWIM LSWIM
         OUTPUT = SCHOOL
         OUTPUT = MSWIM
END



Output from this program is the design shown below.

><>    ><>    ><>    ><>    ><>    ><>    ><>    ><>
<><    <><    <><    <><    <><    <><    <><    <><
><>    ><>    ><>    ><>    <><    <><    <><    <><
<><>          <><>          <><>          <><>
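
The same design can be built in a general-purpose language, which gives a convenient way of checking the expected output. The Python sketch below is only an analogue of the Snobol program: the lower-case names are chosen for this illustration, `+` stands in for the blank that denotes concatenation, and print() takes the place of assignment to OUTPUT.

```python
lt, gt = "<", ">"
blank4 = " " * 4
blank10 = blank4 + blank4 + "  "

lfish = lt + gt + lt          # fish swimming left
rfish = gt + lt + gt          # fish swimming right
mfish = lfish + gt            # mating fish

lswim = (lfish + blank4) * 4
rswim = (rfish + blank4) * 4
mswim = (mfish + blank10) * 3 + mfish
school = rswim + lswim

lines = [rswim + rswim, lswim + lswim, school, mswim]
for line in lines:
    print(line)               # one record per line of the design
```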



3A. THE FLOW OF CONTROL

The statements which make up a Snobol program are
seldom designed to be executed in the order in which they
are written in the program text. Instead, certain segments
of the program, consisting of one or more statements each,
are intended to be executed repeatedly until some
terminating condition is encountered. This condition may be
that a certain pattern of characters has occurred in the
data, that the data group is exhausted, that the segment has
been executed a certain number of times, etc. Once the
terminating condition has been met, then repeated execution
of another such segment, or "loop," may begin. The choice of
the particular segment to be executed can be made dependent
on certain features of the data being processed, so the use
of the same program with different data will often result in
the execution of a different set of statements from within
the program. The actual order in which the statements of a
program are executed is called the "flow of control."

The flow of control is specified by means of labels
which are given to statements for purposes of reference, and
by means of go-to's which indicate the statement to be
executed next by making reference to its label. The label of
a statement is written to the left of its rule, and the go-to
is written to the right, as in

ASSIGN   VOWELS = 'AEIOU'   : (NEXT)

Here the label of the statement is ASSIGN, the rule
specifies an assignment, and the go-to specifies that the 
next statement to be executed after this assignment takes 
place is the one labelled NEXT. If the go-to part of a 
statement is absent, it is understood that control flows by 
default to the following statement of the program. 

Labels. Any statement may be given a label so that it
may be referred to by other statements of the program, or
simply by the programmer for his own convenience. A label
must always be an identifier and should be chosen so as to
be mnemonically useful. Care must be taken when giving
statements labels to see that the same label does not occur
twice within a single program, or a compile-time error will
occur.

Labels are distinguished from the names of variables in 
a Snobol statement by their position. A label, if present, 
must always start in the first character position of a
statement and must be separated from the rule, if present,
by one or more blanks; if a statement is not labelled, the
rule must begin with a blank. Because they are distinguished
by position, labels and variable names of the same form may
be used freely together without confusion, as in

VOWELS   VOWELS = VOWELS 'YW'

which is a statement labelled VOWELS, whose rule specifies 
that the variable named VOWELS is to have the characters YW 
concatenated to its value. 

It is sometimes convenient to write a statement which 
consists solely of a label, as in 

READ

since this makes subsections of the program text easy to 
locate and makes modifications simpler. 



Go-to's. The presence of a go-to within a statement is
signalled by the occurrence of a colon which serves as an
explicit separator between the go-to and any other part of
the statement which may have preceded it. Following the
colon (which may optionally be bounded by one or more
blanks) the information as to which statement is to be
executed next is provided by writing the label of that 
statement within parentheses. For instance, the statement 

: (TEST) 

consists of a go-to only (it has no label and no rule) and 
specifies that the next statement to be executed is the one 
labelled TEST. 

Usually a go-to follows a rule, as in the statement 

VOWELS = TRIM(INPUT) : (TEST)

which specifies that after the assignment is performed, the 
next statement to be executed is the one labelled TEST. 

The form of the go-to's just shown is called
unconditional, because execution of the statement in which
they occur will always cause a transfer of control to the
statement labelled TEST. More commonly, go-to's are
conditional upon the possible failure of the rule which
precedes them in the same statement. This causes a choice,
or branch, to occur in the flow of control and allows the
data to determine which path through the program will be
followed next. (Ways in which rules may fail will be
indicated presently.) 

Conditional go-to's are written like unconditional
go-to's, with the addition of a prefixed F (for failure) or S
(for success). The statement

TEST     LINE = INPUT   : F(WRITE)

specifies that control be transferred to the statement
labelled WRITE only if the rule LINE = INPUT fails.
Similarly, the statement

TEST     LINE = INPUT   : S(READ)

specifies a transfer to the statement labelled READ unless
the rule fails (i.e., if it succeeds). In either statement,
if the condition for transfer is not met, control will pass
by default to the next statement of the program. Thus a
conditional go-to always embodies both a success and a
failure transfer, even though one of them may be expressed
implicitly rather than explicitly. Both a success and a
failure transfer may be written explicitly in a single
statement as in

TEST     LINE = INPUT   : F(WRITE) S(READ)

Since both cases are provided for explicitly, control will
never pass to the following statement by default. The order
of the success and failure transfers is immaterial and the
space between them is optional; the only important
requirement is that no blank may intervene between an F or
an S and its following open parenthesis.
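
The choice of the next statement can be summarized as a small decision procedure. The Python sketch below is an illustration only, not a description of how any Snobol system is built; the function next_label and its parameter names are invented for this purpose.

```python
def next_label(succeeded, following, success=None, failure=None, unconditional=None):
    """Pick the label of the next statement, given whether the rule succeeded.

    `following` is the label of the statement written next in the program text;
    the other arguments are the transfers written in the go-to, if any.
    """
    if unconditional is not None:
        return unconditional          # like  : (TEST)
    target = success if succeeded else failure
    # A transfer that is not written explicitly defaults to the following statement.
    return target if target is not None else following


# TEST  LINE = INPUT  : F(WRITE)   with the rule failing, then succeeding
after_failure = next_label(False, "NEXTSTMT", failure="WRITE")
after_success = next_label(True, "NEXTSTMT", failure="WRITE")
```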

The Special Transfer END. A go-to specifying a transfer
to END is used to terminate execution of a program. This
transfer has a special system definition, and constitutes a
request to the Snobol system to stop executing. Any number
of statements in a program may contain go-to's specifying
transfers to END, and the first such transfer to be taken
ends execution of the program.

An alternative way of terminating execution is to
execute the statement which stands last in the program text,
without taking a transfer from it back to some other
statement of the program.

There is no restriction against using END as the label
of any statement of the program text, but if this is done
its special system definition is lost. The convention
adopted here is to terminate every program text with a
statement consisting solely of the label

END

A transfer to END causes this last statement to be executed
and the flow of control continues on to the next statement;
since there is no next statement, the program terminates and
the effect is the same as if the system definition of END
had not been overridden.

Failure of the Rule. Failure of the rule is not an
error and does not cause execution of the program to cease.
Rather, it is used to direct the flow of control and to
prevent the rule which has failed from continuing execution.
When a rule fails, control is sent immediately to the go-to
part of the statement so no further processing of the rule
is undertaken; in particular, the assignment specified by an
assignment rule does not occur. If the statement in which
the failure occurs has no go-to, control passes by default
to the next statement of the program; if the go-to is
conditional (as would usually be the case) the failure
transfer, expressed explicitly or implicitly, is taken; if
the go-to is unconditional, this unconditional transfer is
used.

Failure of INPUT. There are a variety of ways in which
a rule can fail. Of the rules presented so far, however,
only those which call for the reading of data (those in
which the value of INPUT is needed) have any possibility
of failing. Such a rule will fail when an end-of-group
record is read, i.e., when there are no more data records in
the group to become the new value of INPUT. The ability to
test for an end-of-group mark, and to direct the flow of
control if it is encountered, makes it possible to specify
that some process is to be performed on all the records of a
data group without having to specify how many records that
might be. For example, all the records of a data group, no
matter how many there are, may be printed by executing the
following very simple complete program text.



READ     OUTPUT = INPUT   : S(READ)

END 

Every time the statement labelled READ is executed,
INPUT acquires the value of the next data record. If that
value is not an end-of-group mark, it is assigned to the
variable OUTPUT and hence printed. Since the rule has not 
failed, control is sent back to READ and the process is 
performed again. This single statement, a one-statement
loop, will be executed repeatedly until the end-of-group
mark is encountered, causing the rule to fail. In this case
the assignment will not take place and the value of OUTPUT
will remain unchanged. Control will then flow by default to
the statement labelled END, terminating the program.

More than one data group may be processed by a single
program since the reading of an end-of-group mark does not
prevent further reading of data. The following program text
prints two data groups, the first in single-spaced format
(as above) and the second in double-spaced format (with a
blank line following each record). It prints a message at
the end of the first group.

READ1    OUTPUT = INPUT                   : S(READ1)
         OUTPUT = 'END OF GROUP ONE.'
READ2    OUTPUT = INPUT                   : F(END)
         OUTPUT = NULL                    : (READ2)
END

The one-statement loop labelled READ1 fails when INPUT
acquires the value of the first end-of-group mark, but the
next use of INPUT (in the two-statement loop starting at
READ2) causes it to acquire the value of the first data
record in the second group. Eventually a failure of INPUT
will occur in this statement as well, when a second
end-of-group mark is read, sending control to END and thus
terminating the program.
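
The behaviour of the two loops may be easier to follow in a general-purpose language. The Python sketch below is an analogue under stated assumptions, not a translation: the marker END_OF_GROUP and the function print_two_groups are invented here, the data come from a list, and the printed lines are collected in a list rather than sent to a printer.

```python
END_OF_GROUP = object()   # hypothetical stand-in for an end-of-group record


def print_two_groups(stream):
    """Return the lines the two-group program would print."""
    records = iter(stream)
    printed = []
    # First loop: single-spaced copy until the first end-of-group mark.
    for record in records:
        if record is END_OF_GROUP:
            break
        printed.append(record)
    printed.append("END OF GROUP ONE.")
    # Second loop: reading resumes where the first loop stopped;
    # each record is followed by a blank line.
    for record in records:
        if record is END_OF_GROUP:
            break
        printed.append(record)
        printed.append("")
    return printed


data = ["A1", "A2", END_OF_GROUP, "B1", "B2", END_OF_GROUP]
result = print_two_groups(data)
```

Because both loops draw on the same iterator, the second loop begins with the first record of the second group, just as the next use of INPUT does after a failure.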

Evaluation Rules. A rule in a program text consisting
of a single expression only is called an evaluation rule. 
The statement 

INPUT : F(DONE) 

consists of an evaluation rule and a go-to. When such a 
statement is executed, the single expression of the rule is 
evaluated, often causing success or failure of the rule to 
be determined; then the go-to part of the statement, if any, 
is processed. The statement above indicates that a record is 
to be read from the input file, and a transfer taken to DONE
if that record is an end-of-group mark. No provision is made 
for preserving the data which is read, but there are some 
applications in which the data is not needed. The two 
complete program texts below provide examples of such 
applications: the first is a program to count the number of 
records in a group and to print the result; the second
prints every other data record in a group, starting with the 
second record. 



* PROGRAM TO COUNT THE NUMBER OF RECORDS IN A GROUP
READ     INPUT                            : F(DONE)
         COUNT = COUNT + 1                : (READ)
DONE     OUTPUT = COUNT ' RECORDS'
END

* PROGRAM TO PRINT EVERY OTHER RECORD STARTING WITH THE 2ND
READ     INPUT                            : F(END)
         OUTPUT = INPUT                   : S(READ)
END
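
Both programs read records whose values are then thrown away. A Python sketch of the same two ideas follows; it is an analogue only, with function names invented for the illustration and with a list standing in for the data group (the end of the list plays the part of the end-of-group mark).

```python
def count_records(group):
    """Count records by reading each one and discarding its value."""
    count = 0
    for _ in group:          # like  INPUT : F(DONE), value read but not kept
        count += 1           # like  COUNT = COUNT + 1
    return str(count) + " RECORDS"


def every_other_record(group):
    """Skip one record, print the next, and repeat until the group ends."""
    records = iter(group)
    printed = []
    for _ in records:                    # first read: value discarded
        second = next(records, None)     # second read: value printed
        if second is None:               # group ended on the second read
            break
        printed.append(second)
    return printed
```

The sketch assumes that no record is the Python value None, which is used here only to signal that the group has ended.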

Evaluation rules are commonly used to direct the flow 
of control through failure of the rule; they can also be 
used to cause a variable to have a special input or output 
association attached to it, to define a new procedure, etc.,
in ways to be described later; in these cases failure of the 
rule is not involved. 

Test Procedures. Failure of the rule may also be caused
by the failure of a procedure call which occurs within the
rule. Snobol provides nine predefined procedures, called
test procedures, which are used primarily to direct the flow
of control. Each test procedure accepts two arguments and
tests to see whether or not some specified relation, such as
equality, holds between them. If the test succeeds, the test
procedure returns the null value and execution of the rule
continues. If the test fails, the rule of which it is a part
fails as well and control is sent immediately to the go-to
part of the statement where the failure transfer will be
taken.




The Test Procedures IDENT() and DIFFER(). [...]
single character, even though the null value is considered
equal to zero when used in arithmetic operations. IDENT()
and DIFFER() perform exactly the same test but return
opposite results: IDENT() fails if its two arguments are not
identical, while DIFFER() fails if its two arguments are
identical. Thus the following statements are equivalent:

IDENT(STRING1,STRING2) : S(SAME)

DIFFER(STRING1,STRING2) : F(SAME)



Spaces, of course, must be considered as any other
character in the data, so if the rules

STRING1 = 'KING LEAR'

and

STRING2 = 'KING LEAR '

had just been executed, the rule with IDENT() above would
fail while the rule with DIFFER() would not.

It is often important, for reasons which will be
indicated presently, to know whether or not a given variable
has a null value. This can be determined by the execution of

IDENT(STRING,'') : S(EMPTY)

or

DIFFER(STRING,NULL) : F(EMPTY)

or something similar. Since any missing argument of a
procedure reference is assumed to be null, the simplest (if
not perhaps the clearest) way to write the above statement
is in the form

IDENT(STRING) : S(EMPTY)

The Test Procedure LGT(). LGT() compares two strings to
determine whether or not the first is "Lexicographically
Greater Than" the second; that is, whether the first
follows the second in alphabetical order. For example, the
sequence

STR1 = 'ABB'
STR2 = 'ABC'
LGT(STR2,STR1) : S(WRITE)

will send control to WRITE since ABC alphabetizes after ABB.

The string values being compared may be of any length 
and may be composed of any characters; the "alphabetic 
order" of non-alphabetic characters is determined by the 
order of the computer's character set (see Appendix I).
Although the character "space" has special significance in
most written languages, it is treated as any other character
by the computer, so its relative position within the
character set must be taken into account when alphabetizing
material containing spaces. 

If either of the values being compared by LGT() is not 
a string, an execution-time error will result. 
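
A comparison of this kind can be tried out in Python, whose string comparison is also made character by character. The sketch is an analogue only: the function lgt is written for this illustration, it returns True or False where the Snobol procedure succeeds or fails, and Python orders characters by Unicode code point, which need not agree with the character set of any particular Snobol installation.

```python
def lgt(first, second):
    """Analogue of LGT(): True if `first` follows `second` in character-set order."""
    if not isinstance(first, str) or not isinstance(second, str):
        # Mirrors the execution-time error described in the text.
        raise TypeError("LGT() compares strings only")
    return first > second


after = lgt("ABC", "ABB")         # ABC alphabetizes after ABB
with_space = lgt("A B", "AB")     # the space is compared like any other character
```

In the code points Python uses, the space precedes the letters, so the second comparison yields False; under a different character set the position of the space would have to be checked.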



Arithmetic Test Procedures. The remaining six
predefined test procedures compare two numeric values for
the following arithmetic relationships:

procedure     relationship

EQ(X,Y)       X equal to Y
NE(X,Y)       X not equal to Y
LT(X,Y)       X less than Y
LE(X,Y)       X less than or equal to Y
GT(X,Y)       X greater than Y
GE(X,Y)       X greater than or equal to Y

All these procedures fail if the indicated relationship does
not hold.

EQ() and NE() are very similar to IDENT() and DIFFER(),
except that here arithmetic identity, rather than character
for character identity, is required. Thus EQ(23,'+00023')
will not fail since both arguments have the numeric value of
23, while IDENT(23,'+00023') will fail since character for
character identity cannot be found between two strings of
different lengths. The expression EQ(NULL,0) succeeds since
the null value and zero are arithmetically identical.

If either argument of an arithmetic test procedure has 
a non-numeric value, an execution-time error results. 
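
The difference between arithmetic identity and character for character identity can be shown with a small Python sketch. It is a simplification made for illustration: the helper names are invented, only integers are handled, True and False stand in for success and failure, and the comparison in ident ignores any distinction of datatype.

```python
def to_number(value):
    """Convert an integer, a numeric string, or the null string to an integer."""
    if value == "":
        return 0                 # the null value counts as zero in arithmetic
    return int(value)            # raises ValueError for non-numeric strings


def eq(x, y):
    """Analogue of EQ(): arithmetic equality."""
    return to_number(x) == to_number(y)


def ident(x, y):
    """Analogue of IDENT(): character-for-character identity."""
    return str(x) == str(y)


same_number = eq(23, "+00023")        # both have the numeric value 23
same_string = ident(23, "+00023")     # strings of different lengths
null_is_zero = eq("", 0)
```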

Test Procedures within Assignment Rules. Any number of
references to test procedures may be embedded within the
right-hand side of an assignment rule where they are used
not only to direct the flow of control but also to determine
whether or not the assignment is to be executed. For
example, the statement

STRING1 = IDENT(STRING1,NULL) STRING2 : F(SKIP)

specifies that STRING1 is to be given the value of STRING2
only if STRING1 has a null value when the rule is executed.
If it is non-null, then the IDENT() procedure will signal
failure, sending control to SKIP before the assignment takes
place, so the value of STRING1 will remain unchanged.

Several arithmetic test procedures may be used in
conjunction with one another to specify a range of
acceptable values. The following rule, for example, allows
the printing of a record having from 2 to 10 characters
only.

OUTPUT = GE(SIZE(REC),2) LE(SIZE(REC),10) REC

If either of the test procedures signals failure, no output 
is produced. 

The following single statement employs two references 
to test procedures to specify that a transfer is to be taken 
to LOOP2 if the value of N is either 0 or 1; if N has
neither value, then whatever value it has is increased by 1
and control flows by default to the next statement.

N = DIFFER(N,0) DIFFER(N,1) N + 1 : F(LOOP2)

The desired condition here is that the value of N be
either 0 or 1, so there is no need to differentiate the two
cases. However, it is often necessary to know which part of 
the rule has signalled failure and to take different 
transfers accordingly. Consider, for instance, the problem 
of giving STRING, if it is null, the value of the next data 
record. The statement 

STRING = IDENT(STRING) TRIM(INPUT) : F(SKIP)

will send control to the statement labelled SKIP if STRING 
is non-null but also if an end-of-group record is 
encountered, making no differentiation between the two 
cases. Different transfers will usually be needed for these
two situations, so in this case it will be necessary to 
express the process in two statements, each having a failure 
transfer, such as the following: 

NEXT = TRIM(INPUT) : F(DONE)
STRING = IDENT(STRING) NEXT : F(SKIP)

The placement of a reference to a test procedure within 
the right side of an assignment rule implies that the value 
which the procedure returns is to be concatenated with any
other right-side values before assignment occurs. All test 
procedures return null values, so the result of such 
concatenation is never visible; the null value concatenated 
with any other value leaves that value unchanged. 
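
One way to picture a test procedure embedded in an assignment is to let failure interrupt the evaluation of the right side before any assignment is made. The Python sketch below models this with an exception; it is an analogy under that assumption, and the names Failure, ge, le and print_if_short are invented for the illustration.

```python
class Failure(Exception):
    """Signals that a test procedure, and hence the whole rule, has failed."""


def ge(x, y):
    if not x >= y:
        raise Failure
    return ""            # test procedures return the null value on success


def le(x, y):
    if not x <= y:
        raise Failure
    return ""


def print_if_short(rec, printed):
    """Like  OUTPUT = GE(SIZE(REC),2) LE(SIZE(REC),10) REC"""
    try:
        # The null values vanish in the concatenation, leaving only rec.
        value = ge(len(rec), 2) + le(len(rec), 10) + rec
    except Failure:
        return False     # rule failed: no assignment, nothing printed
    printed.append(value)
    return True
```

The returned True or False corresponds to the success or failure on which a conditional go-to would act.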

Loops. Any useful program will contain at least one
(and usually many) loops which are to be executed repeatedly
until some terminating condition is encountered. These loops
may consist of any number of statements (they are typically
longer than the one and two-statement loops which have been
the only examples presented so far), and may overlap or be
nested within one another. The terminating condition may be
that an end-of-group record is read (as in the earlier
examples), that some other feature of the data is
encountered, or that the loop has been entered a certain
number of times. Every time a loop is entered it is
necessary to perform some test, often with the use of a test
procedure, to determine whether or not the terminating
condition has been met; if it has, control is sent out of
the loop to some other part of the program. If the test is
accidentally omitted, or set up wrongly, then there may be
no way to leave the loop and the set of statements of which
it is composed will be executed repeatedly until the program
is terminated by the computer's operating system. When this
happens, the program is said to be in an "infinite" loop.

Loops Controlled by Data Conditions. The terminating
condition for a loop may be that a record of a certain form
is encountered in the data. If this record is an end-of-
group mark, then the test for its existence can be made by
simply providing a failure transfer on a statement in which
the value of INPUT is needed. However, it is often useful to
divide the data into "subgroups," each of which is
terminated by a record having a special pattern of
characters, such as one consisting of asterisks as the first
six characters, followed by spaces. If each subgroup is to
be processed separately, then a test must be made for this
special signal each time a record is read, and a transfer
taken accordingly.

IDENT() or DIFFER() can be used to make this kind of
test. For example, the following program segment reads and
prints all data records until one with asterisks as the
first six characters and no other non-space characters is
encountered; when that record is read, control is sent to
STARS, which may be the initial statement of another loop.

READ  RECORD = TRIM(INPUT)         : F(ERROR)
      IDENT(RECORD,'******')       : S(STARS)
      OUTPUT = RECORD              : (READ)



Note that provision is made for the possibility that a
record consisting of six initial asterisks will not be found
in the group, i.e., that the program is processing the wrong
data. This condition may be treated by transferring to a
statement labelled ERROR when an end-of-group mark is read.
Here an appropriate error message may be written and control
sent either to END or to some other part of the program,
depending on the sort of tasks which still remain to be
done. If such an error exit were not provided there might be
no indication from the program that anything was wrong, and
it might attempt the processing of many groups of erroneous
data. In any event, the program has entered an infinite loop
since it is persistently seeking a terminating condition
which will never be found.
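
The READ loop and its error exit can be pictured with a short Python sketch. It is only an analogy: the list `records` stands in for the data group, running off the end of the list stands in for the failure of INPUT, and the returned label names are simply the labels used in the segment.

```python
def copy_until_stars(records):
    """Model of the READ loop: returns (lines printed, label to which control goes)."""
    printed = []
    for raw in records:                # each pass models RECORD = TRIM(INPUT)
        record = raw.rstrip(" ")       # TRIM removes trailing spaces
        if record == "******":         # IDENT(RECORD,'******') : S(STARS)
            return printed, "STARS"
        printed.append(record)         # OUTPUT = RECORD : (READ)
    return printed, "ERROR"            # INPUT failed at the end of the group: F(ERROR)
```

With the error exit in place, a group that lacks the six-asterisk record ends at ERROR instead of running on into the following data.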



Loops Controlled by Counts. Arithmetic test procedures
are often used to control the number of times that a loop is
to be entered before control is sent to some other part of a
program; that is, the terminating condition for such a loop
will be that it has been executed a given number of times.
Using the EQ() procedure, for example, one may write a loop
to print 5 data records, and then go on to the rest of the
program. (If there are less than 5 records to be read,
control is sent to ERROR where an appropriate error message
can be printed.)

LOOP  OUTPUT = INPUT               : F(ERROR)
      COUNT = COUNT + 1
      EQ(COUNT,5)                  : F(LOOP)
A similar loop may be written by using the LT()
procedure and embedding it within the second assignment
rule, as follows:

LOOP  OUTPUT = INPUT                    : F(ERROR)
      COUNT = LT(COUNT,4) COUNT + 1     : S(LOOP)

In this segment it has been necessary to use 4 as the
test value rather than 5 since the procedure call is
executed before the value of COUNT is incremented, rather
than after as in the earlier example. In both segments,
COUNT is assumed to have the null value when the segment is
executed for the first time.
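
The difference between testing after and before the increment can be checked with a small Python sketch. This is an analogy under stated assumptions: `records` stands in for the data, exhaustion of the list stands in for the failure of INPUT, a null COUNT is treated as 0, and the label "NEXT" is a hypothetical name for the statement following the loop.

```python
def print_five_eq(records):
    """Model of the loop which tests EQ(COUNT,5) after COUNT has been incremented."""
    out, count = [], 0
    source = iter(records)
    while True:
        try:
            out.append(next(source))   # OUTPUT = INPUT
        except StopIteration:
            return out, "ERROR"        # : F(ERROR)
        count += 1                     # COUNT = COUNT + 1
        if count == 5:                 # EQ(COUNT,5) succeeds: fall out of the loop
            return out, "NEXT"


def print_five_lt(records):
    """Model of the loop which tests LT(COUNT,4) before COUNT is incremented."""
    out, count = [], 0
    source = iter(records)
    while True:
        try:
            out.append(next(source))   # OUTPUT = INPUT
        except StopIteration:
            return out, "ERROR"        # : F(ERROR)
        if count < 4:                  # LT(COUNT,4) is evaluated first
            count += 1                 # then COUNT + 1 is assigned
            continue                   # : S(LOOP)
        return out, "NEXT"             # LT failed: COUNT unchanged, fall through
```

Both functions copy exactly five records when at least five are available, which is why the test value must be 4 in the second form.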



Information as to the number of times that something is
to be done may be found on a data record or computed during
the course of execution, rather than being written directly
into the program text. For example, the following segment
would cause the LOOP to be entered as many times as there
were characters in each data record that it was processing.

READ  RECORD = TRIM(INPUT)                        : F(ENDDATA)
      N = SIZE(RECORD)
LOOP  N = NE(N,0) N - 1                           : F(READ)
      [series of statements to process record]    : (LOOP)



Here the test has been placed at the beginning of the
loop instead of at the end, and the counting has been done
by subtraction rather than by addition. It might seem
clearer and more intuitive to perform the process first and
to test for the terminating condition afterwards (as in the
two previous examples). For instance, the program text

READ  RECORD = TRIM(INPUT)                        : F(ENDDATA)
      N = SIZE(RECORD)
LOOP  [series of statements to process record]
      N = NE(N,1) N - 1                           : S(LOOP) F(READ)

might seem to be equivalent to the one given above, in the
sense of always producing the same result. An examination of
the case of a one-character record shows that the program
appears to work properly. In this case it would perform the
process once, find that N was equal to 1 and then leave the
loop correctly by transferring to READ and reading in the
next record.

The difference between the two programs becomes 
apparent when one attempts to process a record consisting 
solely of spaces which when trimmed becomes null. The
program which tests before processing will handle records of
size zero appropriately by failing the first time the loop
is entered and returning immediately to read the next
record. The program which processes first and then tests
will perform the process once (erroneously) and then will
test to see whether the value of N is equal to 1. Since it
is zero, the value of N will be decreased by 1 to become -1,
and control will be sent back into the loop so the process
will be performed again. Henceforth the value of N will
never equal 1, but will run through a series of constantly
decreasing negative numbers. The terminating condition will
thus never be met and the program has entered an infinite loop.
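
The two arrangements can be compared by counting how often the process would be performed for a record of a given size. The following Python sketch is an illustration only; the parameter `limit` is a safety cap added for this sketch so that the runaway case returns instead of looping forever.

```python
def passes_test_first(n):
    """Test at the top:  LOOP  N = NE(N,0) N - 1 : F(READ)  ...process...  : (LOOP)"""
    passes = 0
    while n != 0:          # NE(N,0) is tested before the process
        n -= 1
        passes += 1        # the process is performed once per character
    return passes


def passes_test_last(n, limit=1000):
    """Process first, then:  N = NE(N,1) N - 1 : S(LOOP) F(READ)

    Returns the number of passes, or None if `limit` passes go by without an exit."""
    passes = 0
    while passes < limit:
        passes += 1        # the process is performed before any test is made
        if n == 1:         # NE(N,1) fails: leave the loop
            return passes
        n -= 1             # otherwise N is decreased and the loop is re-entered
    return None
```

For every record size from 1 upward the two functions agree; for size 0 the first returns 0 while the second never reaches its exit.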



4A. PATTERN MATCHING

The process of searching a string of characters to 
determine whether or not it contains one of a specified set 
of strings is called pattern matching. The pattern being 
sought may be something very particular, such as a certain 
character or a certain number of characters, or it may be 
something much more general, such as one of a choice of
characters or all characters preceding one of a choice of 
characters. Like calls to test procedures, pattern matches 
either succeed or fail, causing the rules in which they 
occur to succeed or fail as well. Thus pattern matching may 
be used to direct the flow of control. 

The Pattern Matching Rule. The pattern-matching rule
consists of two main parts: the string reference, whose
value is to be searched, and the pattern. These two parts
must be separated in the program text by one or more blanks.

The very simple pattern-matching statement 

VOWELS 'E' : S(YES) 

specifies that the current value of VOWELS is to be searched
for an instance of the character E, and that a transfer is
to be taken to the statement labelled YES if the search is
successful. If the search fails, then control will flow by
default to the next statement of the program. Whether the
search succeeds or fails, the value of VOWELS is in no way 
affected. 

The pattern part may be in the form of a variable, 
rather than a literal, and may have a value consisting of 
more than one character. For example, the sequence

PAT = 'IOU'

VOWELS PAT : S(YES)

specifies a search through the value of VOWELS for the
three-character string IOU. This pattern match will succeed
(if VOWELS has the value AEIOU) with the third, fourth, and
fifth characters of the string reference being matched, and 
control will be sent to YES. 

The search for the pattern always begins with the first 
character of the string reference and continues through the 
rest of the string from left to right until either a match
is found or all characters have been tested. Note that if
the first statement above had read

PAT = 'OUI'

the search would have failed. The characters OUI are indeed
present within the string reference, but not in the
indicated order.
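
The left-to-right search for a literal pattern can be sketched in Python as follows. This is an analogy only; the function name `search` and its return convention (the zero-based starting position, or None for failure) are choices made for this sketch.

```python
def search(subject, pattern):
    """Try each starting position from the left; the subject itself is never changed."""
    for start in range(len(subject) + 1):
        if subject.startswith(pattern, start):
            return start        # 0 means the match begins with the first character
    return None                 # every starting position has been tried: failure
```

Here `search("AEIOU", "IOU")` succeeds at the third character, while `search("AEIOU", "OUI")` fails because the characters never occur in that order.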

The string reference part of a pattern-matching rule 
may be any expression which gives a string when evaluated. 
Thus executing the statement 

TRIM(TEXT) '□THE□' : S(YES)

will cause the expression TRIM(TEXT) to be evaluated, and
its value to be searched for an instance of the word THE,
surrounded by spaces. Similarly, the use of the variable
INPUT within the string reference will cause it to acquire
the value of the next data record, since this value will be
needed for the execution of the statement. A statement of
the form

TRIM(INPUT) '□THE□' : S(YES)

however, is not likely to be useful since (1) the value of
INPUT has not been assigned to another variable and hence
will be lost, and (2) no distinction is made between failure 
of INPUT and failure of the pattern match. 

The Replacement Rule. The replacement rule specifies a
pattern which is to be sought in the string reference, and
also a replacement for that part of the string which is
matched by the pattern if the search is successful. For
example, the replacement statement

WORD 'A' = 'Y' : S(FOUNDA)

specifies that the character A is to be sought within the 
value of WORD and that the first A which is found, if any, 
is to be replaced by a Y. This new string, with Y in place 
of A, is stored within the memory and assigned to the 
variable WORD; the old value of WORD is lost. 

Note that the search succeeds, replacement occurs, and 
control is sent to the go-to part of the statement as soon 
as the first (leftmost) instance of the pattern is found, so 
successive instances of the pattern remain unfound and 
unaltered. In order to change, for example, all A's within a 
string reference to Y's, one would write a loop of the form

SELF WORD 'A' = 'Y' : S(SELF)

When this rule failed, any A's which had been within the
original value of WORD would all have been changed to Y's.
If WORD referred to the value SASSAFRAS when the loop was
first entered, its new value would be the string SYSSYFRYS.

The replacement for a matched substring may be shorter 
or longer than the string it replaces. Thus one may write a
rule to replace a double vowel by a single one, as in

WORD 'EE' = 'E'

or a single vowel by a double one, as in

WORD 'E' = 'EE'

While it is perfectly safe to write the first of these
replacement statements in a loop, so that all double (or
triple, etc.) E's are reduced to a single E, execution of
the statement

SELF WORD 'E' = 'EE' : S(SELF)

to make all single E's into double ones will send the
program into an infinite loop if the value of WORD contains
an E. Care must always be taken when writing replacement
statements in a loop to insure that the pattern is not
contained within its replacement, unless some terminating
condition other than pattern match failure is used.

Deletion of a matched pattern may be accomplished by 
providing a null value to the right of the assignment sign. 
Thus one may delete all E's from a string reference by 
executing a statement of the form 

DELETE WORD 'E' = NULL : S(DELETE)

which will fail only when no E's remain within the value of 
WORD. 
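
The replacement loops discussed above can be imitated with a short Python sketch. It is an analogy, not Snobol: `replace_all` is a name invented here, and `limit` is a safety cap added so that the runaway case (a pattern contained within its own replacement) can be observed without hanging.

```python
def replace_all(word, pattern, replacement, limit=1000):
    """Model of:  SELF  WORD pattern = replacement : S(SELF)

    Each pass replaces only the leftmost match. Returns the final value of
    WORD, or None if `limit` passes go by without the rule ever failing."""
    for _ in range(limit):
        at = word.find(pattern)
        if at < 0:
            return word                       # the rule failed: the loop ends
        word = word[:at] + replacement + word[at + len(pattern):]
    return None
```

Under these assumptions `replace_all("SASSAFRAS", "A", "Y")` gives SYSSYFRYS, an empty replacement deletes every match, and replacing E by EE never terminates of its own accord.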

The replacement rule, which is syntactically a 
combination of a pattern-matching and an assignment rule, is 
the last of the four types of rules in the Snobol language.
If the rule part of a statement is non-null, it must call 
for either an assignment, an evaluation, a pattern match, or 
a replacement. 

The Alternation Operator. The alternation operator, a
binary operator designated by the symbol |, is used to
specify alternatives within a pattern. The pattern-matching
statement

WORD 'A' | 'E' : S(YES)

specifies that the value of WORD is to be searched for
either an A or an E, and if either is found a transfer is to
be taken to YES.

More than one alternation operator may be used within a 
pattern, as in the statement 

WORD 'A' | 'E' | 'I' | 'O' | 'U' : S(YES)

which will succeed if the value of WORD contains any of the
five vowels. The search for a match proceeds as follows: the
first character of WORD is checked successively for being A,
E, I, O, or U; if it is none of these the second character
is checked beginning with the A alternative, and so on. As
soon as any one of the alternatives is found, transfer is
made to YES. The pattern matching fails only when all
characters of WORD have been examined and no alternative of 
the pattern has been found. 

The alternatives may consist of any number of 
characters, not just a single character as in the example 
above. One may search a line to determine whether or not it 
contains one of a number of words, where a word is defined 
as a sequence of characters surrounded by spaces, by 
employing a statement of the form

LINE '□A□' | '□' WORD1 '□' | '□' WORD2 '□' : S(YES)

The values of WORD1 and WORD2 may be strings of any length.
An alternative way of writing this pattern is used in the
statement

LINE '□' ('A' | WORD1 | WORD2) '□' : S(YES)

Here, parentheses are necessary since the concatenation
operator takes precedence over the alternation operator; if
the parentheses were missing, the statement would be
equivalent to

LINE '□A' | WORD1 | WORD2 '□' : S(YES)

which is not what was intended. 

The Pattern Procedures ANY() and NOTANY(). Snobol has a
number of predefined procedures for use solely in
constructing patterns. The pattern procedures ANY() and
NOTANY() provide an efficient way of expressing alternation,
where the alternatives are single characters only. The
pattern-matching statement

WORD 'A' | 'E' | 'I' | 'O' | 'U' : S(YES)

which employs four instances of the alternation operator may
be written instead as

     WORD ANY('AEIOU') : S(YES)
or   WORD ANY(VOWELS) : S(YES)
or   WORD ANY(TRIM(INPUT)) : S(YES)

(if both VOWELS and TRIM(INPUT) have the value AEIOU). ANY()
accepts for its single argument any expression whose value
is a string, and returns as its value a pattern which will
match any single character of that string. The pattern
returned by ANY() contains only a single test for each
character of the argument string, no matter how many
instances of that character the string contains. That is,
the pattern returned by ANY('SAGAS') is equivalent to that
of 'S' | 'A' | 'G'.

The companion procedure to ANY() is NOTANY() which
returns a pattern to match any single character not
represented in its argument. Thus

WORD NOTANY('AEIOU') : S(YES)

will match the first character within the value of WORD
which is not a vowel. This match will succeed if any
character of the complete character set, except A, E, I, O,
or U, is found.
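
What ANY() and NOTANY() select can be pictured with Python sets. The sketch below is an analogy only: `any_of` and `notany` are names invented here, each returns the first character that the corresponding pattern would match (or None for failure), and the use of a set reflects the remark that repeated characters in the argument add no further tests.

```python
def any_of(word, chars):
    """Model of  WORD ANY(chars): the first character of word that occurs in chars."""
    allowed = set(chars)            # duplicates in the argument collapse to one test
    for ch in word:
        if ch in allowed:
            return ch
    return None                     # no such character: the match fails


def notany(word, chars):
    """Model of  WORD NOTANY(chars): the first character of word not in chars."""
    excluded = set(chars)
    for ch in word:
        if ch not in excluded:
            return ch
    return None
```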

It is always better to use ANY() or NOTANY() where
single character alternatives are involved, but it will be
necessary to use the alternation operator for alternatives
of more than one character. Both methods of expressing
alternation may be used together as in the statement

WORD 'YW' | 'YI' | ANY('AEIOU') : S(GOOD)

The alternation operator and pattern procedures may be
used within replacement rules as well as within pattern-
matching rules. For example, the replacement rule

WORD ANY('AEIOU') = 'X'

specifies that the first vowel within the value of WORD is
to be replaced by an X; the rule

WORD NOTANY('0123456789') = NULL

specifies that the first non-digit is to be deleted. Either
rule may be written in a loop to specify that all vowels are
to be replaced by X's

LOOP1 WORD ANY('AEIOU') = 'X' : S(LOOP1)

or that all non-digits are to be deleted

LOOP2 WORD NOTANY('0123456789') = NULL : S(LOOP2)

The Conditional Assignment Operator. It is often
important when using a pattern which will match any one of a
number of strings to preserve the information as to exactly
what has been matched in the search. This may be done by
assigning the matched substring as the value of a variable
with the conditional assignment operator, a binary operator
whose symbol is a period. The pattern-matching statement

WORD ('AW' | 'AY' | ANY('AEIOU')) . SAVE : F(NO)

specifies that the value of WORD is to be searched for the
alternatives, and that the part of the string reference
which satisfies the pattern is to be assigned to the
variable SAVE. If the value of WORD does not contain any of
these alternatives, then the match fails and no assignment
takes place, i.e., the value of SAVE remains unchanged.

(Note that these particular two-character alternatives 
must be expressed before the one-character alternatives: 
once an A is found the rule succeeds, so a search for AY or
AW would never be undertaken if they were not the first 
alternatives to be tried.) 
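
The effect of the order in which alternatives are written can be seen in a small Python sketch. It is an illustration only: `first_match` is a name invented here, the list VOWELS stands in for ANY('AEIOU'), and the returned string models the value which SAVE would receive.

```python
def first_match(subject, alternatives):
    """Try each starting position in turn and, at each one, every alternative in
    the order written; return the substring matched, or None if the search fails."""
    for start in range(len(subject)):
        for alt in alternatives:
            if subject.startswith(alt, start):
                return alt
    return None


VOWELS = list("AEIOU")      # five single-character alternatives
```

With the two-character alternatives first, `first_match("LAWYER", ["AW", "AY"] + VOWELS)` captures AW; with the single vowels first, the same search stops as soon as the A is found.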

More than one conditional assignment operator may be 
used to assign the same value to more than one variable. The 
statement 

WORD ANY('AEIOU') . SAVE1 . SAVE2 . SAVE3 : F(NO)

assigns the first vowel within the value of WORD to the 
variables SAVE1, SAVE2, and SAVE3. 

If the variable OUTPUT is used, as in 

LINE (WORD1 | WORD2 | WORD3) . OUTPUT

the successful match will be printed. The use of parentheses
is necessary here since the conditional assignment operator
associates itself with the single pattern element
immediately to its left; if the parentheses were missing,
OUTPUT would be assigned a value only if the value of WORD3
was the pattern alternative which caused the rule to
succeed. (If that is what is intended, of course, then the
parentheses should be omitted.)

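The conditional assignment operator is useful within
replacement rules in which the matched pattern is to form
part of the replacement. If, for example, the first
vowel found is to be doubled, one may write
a statement of the form

WORD ANY('AEIOU') . SAVE = SAVE SAVE : F(NOVOWEL)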

since the value assigned to SAVE is immediately available 
for use on the right side of the rule. If the pattern fails, 
control is sent directly to the go-to part of the statement, 
so no assignment can occur, either to SAVE or to WORD. 

Concatenation of Patterns. The concatenation operator
can be used with operands which are patterns, as well as
with strings. For example, in the statement

WORD ANY('AEIOU') 'Y' = 'Y' : F(NOVOWELY)

the operands of the concatenation operator are the pattern
value returned by a call to the ANY() procedure and the
string Y. The result is a pattern which will match any vowel
which is followed by a Y; if this pattern is found it is to
be replaced by a Y alone (i.e., the vowel is to be deleted).
If instead the Y were to be deleted, a statement of the form

WORD ANY('AEIOU') . SAVE 'Y' = SAVE : F(VOWELY)

could be used. Here only a part of the matched pattern (the 
first vowel directly preceding a Y) is to be assigned to the 
variable named SAVE. Note, however, that the entire pattern 
must be found before such assignment can occur. 

It is often useful to assign the different matched 
parts of a string reference to different variables. For 
example, a pattern to search for clusters of three 
consonants, and to assign each consonant to a different 
variable, is employed in the rule 

WORD ANY(C) . C1 ANY(C) . C2 ANY(C) . C3

(It is assumed here that the value of C is a string of
consonants.) The pattern in this rule is the concatenation
of three pattern elements, each of which consists of a
reference to ANY() and a conditional assignment. The three-
consonant string may be assigned to the variable CCC as
well, by placing the entire pattern within parentheses and
using one more conditional assignment operator, as follows:

WORD (ANY(C) . C1 ANY(C) . C2 ANY(C) . C3) . CCC

None of the variables will acquire a new value unless the 
entire pattern is successfully matched. 

The Immediate Assignment Operator. The immediate
assignment operator is a binary operator whose symbol is a
dollar sign ($). It is very similar to the conditional
assignment operator except that it causes the immediate
assignment of any matched substring to a variable, whether
the remaining elements of the pattern are matched
successfully or not. Thus if the rule above were rewritten
as

WORD (ANY(C) $ C1 ANY(C) $ C2 ANY(C) . C3) . CCC

then C1 and C2 would acquire new values each time partial
matches occurred, but C3 and CCC would acquire new values
only when a substring of three contiguous consonants was
found. For example, if WORD had the value ADIEU then C1
would acquire the value D when the match was attempted,
while the rest of the variables remained unchanged; if WORD
had the value CHATEAU then C1 would acquire the successive
values C, H, and T, and C2 would acquire the value H, as
repeated (but unsuccessful) attempts were made to find the
pattern. Thus the immediate assignment operator may be 
useful in determining how much of a pattern was successfully 
matched before failure occurred. 
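
The successive values taken by the immediately assigned variables can be traced with a Python sketch. This is an analogy under stated assumptions: the set CONSONANTS (which here counts Y as a consonant) stands in for the value of C, and the function records every value given to C1 and C2 instead of keeping only the last one.

```python
CONSONANTS = set("BCDFGHJKLMNPQRSTVWXYZ")   # assumed value of C for this sketch


def three_consonants(word):
    """Model of:  WORD (ANY(C) $ C1 ANY(C) $ C2 ANY(C) . C3) . CCC

    Returns (values given to C1, values given to C2, C3, CCC); C3 and CCC
    are None when no run of three consonants is found."""
    c1_seen, c2_seen = [], []
    for start in range(len(word)):
        run = word[start:start + 3]
        if run[0] not in CONSONANTS:
            continue                          # first element fails at this position
        c1_seen.append(run[0])                # immediate assignment to C1
        if len(run) < 2 or run[1] not in CONSONANTS:
            continue
        c2_seen.append(run[1])                # immediate assignment to C2
        if len(run) == 3 and run[2] in CONSONANTS:
            return c1_seen, c2_seen, run[2], run   # whole pattern matched
    return c1_seen, c2_seen, None, None
```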

Both the conditional and immediate assignment operators 
may be applied to the same pattern element, as in the rule 

WORD ANY(VOWELS) $ SAVE1 . SAVE2 'T'

which specifies a search for any vowel which is followed
directly by a T. (The order in which the immediate and
conditional assignment operators occur is immaterial.) If
the pattern match succeeds, then both SAVE1 and SAVE2 will
refer to the same value, that of the first vowel encountered
which occurred directly before a T. If WORD contains one or
more vowels, but not one occurring before a T, then the
match will fail and the value of SAVE2 will be unchanged,
but SAVE1 will acquire as successive values all vowels
within the value of WORD which were encountered in the
attempts to find the pattern.



The variable OUTPUT may be used in conjunction with the
immediate assignment operator to produce a printed trace of
the progress of the pattern-matching operation. For example,
if the variable OUTPUT were written in place of SAVE1 above,
producing the rule

WORD ANY(VOWELS) $ OUTPUT . SAVE2 'T'

and the value of WORD was the string ECCLESIASTICAL, then
the following output would be produced: 

E 
E 
I 
A 
I 
A 

When a transfer was taken to the next statement, the value
of OUTPUT would be A and the value of SAVE2 would not have 
been changed, since the pattern match did not succeed. 

The Pattern Procedures SPAN() and BREAK(). SPAN() and
BREAK() are procedures which match not just a single
character but a string of characters of indefinite length.
SPAN() returns a pattern which matches a string composed
solely of the characters specified within its argument. For
example, a string consisting of one or more vowels may be
specified by the pattern

SPAN('AEIOU')

BREAK() returns a pattern which matches a string composed of
any characters except those specified in its argument. Thus
a string consisting of anything but vowels may be specified
by the pattern

BREAK('AEIOU')

Both SPAN() and BREAK() must find a character from
their argument strings in order to succeed. SPAN() will
match that character along with any other acceptable
characters which are contiguous; BREAK() will match
everything up to such a character, leaving the "break
character" itself unmatched.

Note that the pattern returned by BREAK() may match the
null value, as in

WORD = 'IDLE'
WORD BREAK('AEIOU') . SAVE

Here SAVE will be assigned the null value since BREAK()
matches all characters preceding the first vowel, or in this
case no characters. SPAN() can never match the null value
since it must match at least one of the characters of its
argument.

SPAN() and BREAK() are often used together to break
data into significant units, such as words. If a word is
defined as a string of characters terminated by any number
of spaces, periods, or commas, then the following program
segment can be used to assign to the variable WORD each new
word of the data.

READ  LINE = TRIM(INPUT) '□'                           : F(DONE)
LOOP  LINE BREAK('□.,') . WORD SPAN('□.,') = NULL      : F(READ)
      [sequence of statements to process WORD]         : (LOOP)

In the replacement statement labelled LOOP,
BREAK('□.,') matches all characters until a space, period,
or comma is encountered. The sequence of characters which
have been matched is assigned to the variable WORD.
SPAN('□.,') will then match the character which caused
BREAK('□.,') to succeed, and any other spaces, periods, or
commas which may be contiguous. This entire pattern is then
replaced by the null value (removed from LINE), the value of
WORD is processed in some way, and control sent back into
the loop again. The replacement rule fails only when no more
words remain to be processed and a new value for LINE is
read in. Note that a space has been concatenated to the
trimmed value of each data record to insure that
BREAK('□.,') will be able to find a "break character" at the
end of the last word, and SPAN('□.,') will have at least one
character to match.
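
The word-splitting segment can be imitated in Python to show how the two procedures cooperate. The sketch is an analogy only: `brk`, `span`, and `words` are names invented here, each of the first two reports how many characters would be matched at the start of the text (None for failure), and an ordinary space is used where the program text shows the blank symbol.

```python
def brk(text, chars):
    """BREAK(chars): number of characters matched at the start of text,
    or None if text contains no break character at all."""
    for i, ch in enumerate(text):
        if ch in chars:
            return i                 # may be 0: BREAK can match the null value
    return None


def span(text, chars):
    """SPAN(chars): length of the leading run of characters drawn from chars,
    or None if that run is empty (SPAN never matches the null value)."""
    n = 0
    while n < len(text) and text[n] in chars:
        n += 1
    return n if n > 0 else None


def words(record, delimiters=" .,"):
    """Model of the READ/LOOP segment for a single data record."""
    line = record.rstrip(" ") + " "          # LINE = TRIM(INPUT) followed by a space
    found = []
    while True:
        b = brk(line, delimiters)
        if b is None:
            return found                     # the rule fails: F(READ)
        s = span(line[b:], delimiters)       # succeeds, since line[b] is a delimiter
        found.append(line[:b])               # . WORD
        line = line[b + s:]                  # the matched part is replaced by null
```

Because BREAK succeeds from the first character whenever a delimiter remains in LINE, trying only the initial position is sufficient in this sketch.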



The Pattern Procedure LEN(). The pattern procedure
LEN() accepts any non-negative integer argument, and returns
a pattern to match as many characters as its argument
specifies. Thus LEN() matches strings of predictable length
but unpredictable content, while BREAK() and SPAN() match
strings of predictable content but unpredictable length.



LEN() is useful between two pattern elements to specify
the exact number of characters which must lie between them
for the match to succeed. Thus the search for four-character
strings within parentheses might be specified by the
statement

LINE '(' LEN(4) . INSIDE ')' : F(OUT)

Note that the strings matched by the three concatenated
pattern elements must be contiguous for the match to
succeed. Thus the above rule does not mean "at least four
characters between parentheses" but "exactly four." If the
rule is successful, the first string of four characters
found between parentheses will be assigned to the variable
INSIDE.
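
The requirement that the three elements be contiguous can be made concrete with a Python sketch. It is an illustration only; `inside_parens` is a name invented here, and its result models the value which INSIDE would receive (None when the rule fails).

```python
def inside_parens(line, n=4):
    """Model of:  LINE '(' LEN(n) . INSIDE ')'

    The closing parenthesis must stand exactly n characters after the opening one."""
    for start in range(len(line)):
        end = start + 1 + n                  # where the ')' has to be
        if line[start] == "(" and end < len(line) and line[end] == ")":
            return line[start + 1:end]
    return None
```

A parenthesized string of any other length is passed over, so `inside_parens("A (TOOLONG) B (FOUR) C")` yields FOUR.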

LEN() is often used at the beginning of patterns to 
match an initial field of the data, such as an 
identification number. The statement

LINE LEN(10) . IDNUMBER LEN(40) . DATA : F(SHORT)

assigns the first 10 characters of LINE to the variable
IDNUMBER, and the next 40 characters to the variable DATA.
The rule will fail only if LINE contains less than 50
characters. 

Statements of the form 

LINE LEN(10) . IDNUMBER 'A' : S(ALINE)

are often erroneously used to specify a search for lines
with A as the eleventh character. While it is true that all 
such lines will be found by the above rule, many other lines 
may be found as well. The rule will succeed if a string of 
10 characters preceding an A can be found anywhere within 
the value of LINE, not necessarily in initial position. 

The ANCHOR() Procedure. The ANCHOR() procedure may be
used to "anchor" all searches so that they succeed only in 
initial position. In anchored mode, if a pattern does not 
match beginning with the first character of the string 
reference, failure is recorded immediately and no further 
pattern searching occurs. 

The normal, unanchored, mode of pattern matching can be 
changed to anchored mode by executing an evaluation rule of 
the form 



     ANCHOR('ON')
or   ANCHOR('XXX')
or   ANCHOR(VOWELS)

or any other rule in which the ANCHOR() procedure is called
with a non-null argument. Executing the sequence

ANCHOR('ANCHORITE')

LINE LEN(10) . IDNUMBER 'A' : S(ALINE)

would cause a transfer to ALINE only when the eleventh
character of LINE was indeed an A.
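
The difference which anchoring makes to this rule can be sketched in Python. This is an analogy only: the function name and the `anchored` flag are inventions of this sketch, and the result models the value which IDNUMBER would receive.

```python
def len10_then_a(line, anchored=False):
    """Model of:  LINE LEN(10) . IDNUMBER 'A'

    Unanchored, every starting position is tried; anchored, only the first."""
    starts = [0] if anchored else range(len(line))
    for start in starts:
        if line[start + 10:start + 11] == "A":    # an A directly after ten characters
            return line[start:start + 10]
    return None                                   # failure
```

For a line such as `"XX0123456789A"` the unanchored search succeeds by starting at the third character, while the anchored one fails because the eleventh character is not an A.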



The anchored mode remains in effect until another rule is
executed in which the ANCHOR() procedure is called with
an argument having a null value, such as

     ANCHOR()
or   ANCHOR(NULL)

The original unanchored mode of pattern-matching is then
restored.

The Pattern Procedures TAB() and RTAB(). The pattern
procedures TAB() and RTAB() specify pattern matching not in
terms of character content or of length, but in terms of
position within the string reference. Both TAB() and RTAB()
accept a single argument which must be a non-negative
integer and return a pattern to match all the characters up
to that position within the string reference, matching as
always from the left. The difference between TAB() and
RTAB() is that they use opposite conventions for numbering
the string positions (and thus for interpreting their
arguments): TAB() works in terms of numbers counted from the
left, RTAB() in terms of numbers counted from the right, as
shown in the following charts:



For TAB():

     character:           1   2   3   4   5   6   7
     string position:   0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
                          C   A   M   E   L   O   T

For RTAB():

     character:           7   6   5   4   3   2   1
     string position:   7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
                          C   A   M   E   L   O   T



Notice that although there is no zero-th character,
there is a zero-th string position, just before the first
character or just after the last one, depending on whether
TAB() or RTAB() is being used. This prevents confusion when
thinking about characters in terms of their string
positions: TAB(2), "everything up to string position 2,"
matches the first two characters; RTAB(1), "everything up to
string position 1 counting from the right," matches all the
characters but one. Although the argument of RTAB() is an
integer to be used in counting from the right, this does not
imply that pattern-matching is done from the right; pattern-
matching always proceeds from the left.

TAB() and RTAB() may be used for breaking up strings
into fixed fields; the rule



LINE TAB(15) . ID TAB(70) . TEXT

assigns the first 15 characters of LINE to ID, and the next
55 characters (those remaining up to string position 70) to
TEXT. This is exactly equivalent to the rule

LINE LEN(15) . ID LEN(55) . TEXT

If the first field were of varying length, terminated
by a space, then

LINE BREAK('□') . ID '□' TAB(70) . TEXT

would assign everything up to the first space to ID, and all
characters after the space but before string position 70 to
TEXT. Note that this is not equivalent to

LINE BREAK('□') . ID '□' LEN(70) . TEXT

in which all characters up to the first space are assigned
to the variable ID (as before) but a full 70 characters
following the space are assigned to the variable TEXT. TAB()
may match strings of varying length ending at a definite
string position, while LEN() will always match a definite
number of characters ending at varying string positions.

RTAB() can be used like TAB() for patterns in which the
string position terminating the match is better expressed as
a count from the right rather than from the left. RTAB(0) is
particularly useful; it will always match everything from
the current position in a pattern search up to the end of
the string, that is, the "remainder" of the string after any
other pattern elements have been matched.






Both TAB() and RTAB() can match the null value; but if
either attempts to match up to a string position to the left
of one which has already been matched by a preceding pattern
element, or a string position which does not exist (because
the string is too short), the pattern match will fail.

The Pattern Procedures POS() and RPOS(). The pattern
procedures POS() and RPOS() return patterns which match no
characters at all (the null value); they match only the
single string positions specified by their single
non-negative integer arguments. POS() uses the numbering system
of TAB(), RPOS() that of RTAB(). Their use is to restrict
successful matches by other pattern elements to certain
positions in string references; this provides a more
flexible form of "anchoring."

A pattern which begins with POS(0) is anchored in the
usual way. The rule

     LINE POS(0) '******'

will succeed only if the value of LINE contains asterisks as
its first six characters. (The advantage over turning on the
ANCHOR procedure is that the restriction applies to this
single rule only.) Similarly, the rule

     LINE POS(7) '******'

will succeed only if the value of LINE contains asterisks as
characters 8 through 13.

RPOS() permits the same kind of anchoring, counting
from the right; the rule

     LINE '******' RPOS(0)

will match only if the value of LINE ends with six
asterisks, and

     LINE POS(0) '******' RPOS(0)

will succeed only if the value of LINE is precisely a
six-character string of asterisks. That is, the above
pattern-matching rule is equivalent to the evaluation rule

     IDENT(LINE, '******')

The Pattern Procedure ARBNO(). ARBNO() is the only
pattern procedure which accepts a pattern as its argument.
It returns a pattern which will match zero or more
occurrences of the pattern given in its single argument.
Note that matching zero occurrences is the same as matching
the null value; since this is always the first choice for
the ARBNO() procedure, a call to it always succeeds. ARBNO()
will match as many occurrences of the specified pattern as
will cause the remainder of the pattern to succeed.

A string is a simple form of a pattern, so the argument
of ARBNO() may be a single character or several characters. A
pattern to match zero or more A's may be specified as

     ARBNO('A')

This differs from

     SPAN('A')

in that the SPAN() procedure must always match at least one
character, so the pattern which is the value of SPAN('A')
matches one or more A's instead.

A pattern which will match any number of characters,
including none, enclosed within parentheses (rather than
exactly some fixed number) can be specified with the
use of ARBNO() as follows:

     LINE '(' ARBNO(LEN(1)) . INSIDE ')'     :F(NOPAREN)

This pattern will match strings of the form

     (1)
     (AB)
     (XXX)

The null value or the characters within the parentheses will
be assigned to the variable INSIDE.

A more complicated illustration of the use of ARBNO()
is provided by a consideration of the following set of
sentences:

     The dog ran.
     The old dog ran.
     The old, gray dog ran.
     The old, gray, barking dog ran.

The similarity among these sentences may be characterized in
terms of some pattern which would succeed when applied to
any of them. Such a pattern may be written with the use of
ARBNO() as follows:

     'THE ' ARBNO(BREAK(' ,') LEN(1)) 'DOG RAN.'

When this pattern is applied to the first sentence, the
ARBNO() procedure matches zero instances of its argument, or
the null value, since the literal strings within the pattern
account for the entire sentence. In the second sentence,
ARBNO() matches one instance of its pattern, the string
'OLD '. In the third sentence, ARBNO() matches three instances
of its pattern, the string 'OLD, GRAY '. This is three
instances since BREAK() first matches everything up to the
comma, then up to the space following the comma, then up to
the space following GRAY. In the last sentence, ARBNO()
matches five instances of its pattern, the string
'OLD, GRAY, BARKING '. The pattern matching in the last
sentence occurs as follows:

(1) the opening literal matches to begin with and
ARBNO() matches no instances of its pattern (or the null
value); but then the closing literal cannot be matched, so
an instance of the ARBNO() pattern is sought with

(2) BREAK() matching everything up to the comma (the
string OLD), and LEN() matching the comma; when the final
literal cannot be matched, successive instances of the
ARBNO() pattern are tried with

(3) BREAK() matching everything up to the blank (the
null value) and LEN() matching the blank, then

(4) BREAK() matching everything up to the next comma
(the string GRAY) and LEN() matching the comma, then

(5) BREAK() matching everything up to the following
blank (again the null value) while LEN() matches the blank,
and finally

(6) BREAK() matching everything up to the next blank
(the string BARKING) and LEN() matching the blank. At this
point the final literal can be matched and the entire
pattern matching is completed.

These successive attempts by ARBNO() to match the
number of instances of its argument which will cause the
remainder of the pattern to succeed could be observed by
using the immediate assignment operator in conjunction with
the variable OUTPUT as described earlier.
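
The six steps above can be imitated in a few lines of Python. This is a sketch under stated assumptions, not a Snobol pattern matcher: the function names one_instance and arbno_instances are hypothetical, the match is anchored at the left end, and because BREAK(' ,') LEN(1) has only one way to match at any position, a simple loop stands in for the general backtracking that ARBNO() performs.

```python
def one_instance(s, pos, stops=" ,"):
    """BREAK(' ,') LEN(1): run up to the next stop character, then take it."""
    end = pos
    while end < len(s) and s[end] not in stops:
        end += 1
    # BREAK fails if no stop character remains in the string.
    return end + 1 if end < len(s) else None

def arbno_instances(sentence, head="THE ", tail="DOG RAN."):
    """Return the list of instances matched by ARBNO(), or None on failure."""
    if not sentence.startswith(head):
        return None
    pos, found = len(head), []
    # The closing literal is always tried first; only when it cannot be
    # matched is one more instance of the ARBNO() argument sought.
    while not sentence.startswith(tail, pos):
        nxt = one_instance(sentence, pos)
        if nxt is None:
            return None
        found.append(sentence[pos:nxt])
        pos = nxt
    return found

for s in ("THE DOG RAN.", "THE OLD DOG RAN.",
          "THE OLD, GRAY DOG RAN.", "THE OLD, GRAY, BARKING DOG RAN."):
    print(len(arbno_instances(s)), arbno_instances(s))
```

The instance counts printed for the four sentences are zero, one, three, and five, in agreement with the discussion above.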






Assigning Patterns to Variables. Patterns may be
assigned as the values of variables just as strings are
assigned as the values of variables. This may be done with
an assignment rule of the usual form, such as



     PAT = 'IOU'



or 



or 



     DOG = 'THE ' ARBNO(BREAK(' ,') LEN(1)) 'DOG RAN.'

The variable which refers to the pattern, rather than
the pattern itself, may then be used within the pattern part
of a rule as in

     VOWELS PAT                :S(YES)
or
     LINE ID.PAT               :F(SHORT)
or
     DOGLINE DOG               :F(NODOG)

When these statements are executed, the current values
of PAT, ID.PAT, and DOG are obtained; thus the pattern
matching and the conditional assignment are performed
exactly as if the patterns themselves were expressed.



The value of the variable PAT is of datatype String,
but it may be used as the pattern part of a pattern-matching
rule, as indicated at the very beginning of this chapter,
since a string is a trivial form of a pattern. The values of
ID.PAT and DOG are of datatype Pattern, since they are
concatenations of values of calls to procedures which return
patterns. Any expression containing a reference to a pattern
procedure, an alternation operator, a conditional or
immediate assignment operator, or a deferred evaluation
operator (described below), has a value of datatype Pattern.
The values of such expressions cannot be assigned to the
special variable OUTPUT, since only strings can be printed.
(Ways of printing the value of an expression of datatype
Pattern are indicated in Appendix A, section II.B, s.v.
"PROTOTYPE".) The variables ID.PAT and DOG are of course
in no way restricted to having only Patterns as their
values, but may be assigned values of any datatype in other
parts of the program.



If a pattern occurs within a rule which is to be
executed more than once, or if the same pattern occurs in
more than one rule, a considerable increase in program
efficiency can be obtained by assigning the pattern as the
value of a variable. The use of a variable within the rule
makes it unnecessary to construct the pattern every time the
rule is executed.



When a pattern is assigned to a variable, as in the
rule

     ALTPAT = X | Y

any variables occurring within the pattern (X and Y
above) are evaluated when the assignment rule is executed.
Thus if X had as its value the string A and Y the string B,
the value of ALTPAT after the above rule had been executed
would be equivalent to 'A' | 'B'.

There are often applications, however, in which one
wants the variables of the pattern to be evaluated only when
the pattern is used in a pattern-matching rule, not when the
assignment occurs. For example, a loop to search the value
of WORD for one of two substrings, each to be read from the
input file, may be written as follows:

LOOP1    X = TRIM(INPUT)                :F(DONE)
         Y = TRIM(INPUT)                :F(ERROR)
         WORD X | Y                     :S(FOUND) F(LOOP1)



Since the efficiency of the program can be increased by
using a variable which refers to a pattern, rather than the
pattern itself, one would like to be able to write the loop
as

         ALTPAT = X | Y
LOOP2    X = TRIM(INPUT)                :F(DONE)
         Y = TRIM(INPUT)                :F(ERROR)
         WORD ALTPAT                    :S(FOUND) F(LOOP2)



If this is done, however, the loop will not have the same
meaning as before. The new values of X and Y which are
acquired from the input file on each iteration of the loop
will not affect the value of ALTPAT; rather its value will
remain unchanged at 'A' | 'B' (if A and B were the values of
X and Y when the assignment occurred).

The Deferred Evaluation Operator. The deferred
evaluation operator, a unary operator whose symbol is an
asterisk (*), may be used within patterns to take care of
the above situation. It may be written directly before the
name of a variable to indicate that its evaluation is to be
deferred until its value is needed during a pattern-matching
operation. For instance, the assignment rule






     ALTPAT = *X | *Y

may be used to indicate that both X and Y are variables
which are to be re-evaluated each time a pattern-matching
rule is executed in which ALTPAT is used within the pattern
part. Thus the sequence

         ALTPAT = *X | *Y
LOOP3    X = TRIM(INPUT)                :F(DONE)
         Y = TRIM(INPUT)                :F(ERROR)
         WORD ALTPAT                    :S(FOUND) F(LOOP3)

will produce the same results as the LOOP1 example above,
but more efficiently.
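
The contrast between a pattern whose variables are evaluated at assignment and one whose variables are deferred may be easier to see in a small Python analogy. This is a sketch only: the names eager and deferred are hypothetical, regular expressions stand in for Snobol patterns, and the Python version rebuilds its expression on every call, so it illustrates the meaning of deferral rather than the efficiency gain.

```python
import re

x, y = "A", "B"

# Eager: the alternatives are fixed when the pattern is built,
# in the manner of ALTPAT = X | Y.
eager = re.compile(re.escape(x) + "|" + re.escape(y))

# Deferred: the variables are looked up each time the pattern is used,
# in the manner of ALTPAT = *X | *Y.
def deferred(subject):
    return re.search(re.escape(x) + "|" + re.escape(y), subject)

x, y = "Q", "Z"   # new values, as if read from the input file

print(bool(eager.search("QUIZ")))   # the eager pattern still seeks A or B
print(bool(deferred("QUIZ")))       # the deferred pattern seeks Q or Z
```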



The unary * operator is also useful in patterns in
which the value of one pattern element is dependent on the
successful match of an earlier element of the same pattern.
Consider, for example, the problem of searching a word to
determine whether or not it contains two identical
contiguous vowels. This pattern may be expressed using the *
operator as

     VOW2PAT = ANY(VOWELS) $ V *V

When this pattern is used, as in the statement

     WORD VOW2PAT

it specifies a search through the value of WORD for any of
the five vowels, immediate assignment of the vowel found to
the variable V, and then a search of the next character for
another instance of that same vowel.

A more general pattern in the same vein is one which
searches for two identical contiguous characters. This may
be expressed as

     CHARPAT = LEN(1) $ CHAR *CHAR

and works as described above. Without the use of deferred
evaluation, these patterns would be cumbersome to define.
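
The search these two patterns describe can be restated as a short Python loop. The sketch below is an analogy rather than Snobol: the function name first_double and its parameter allowed are hypothetical, and the loop over positions plays the part of the unanchored pattern search.

```python
def first_double(word, allowed=None):
    """Return the first character that occurs twice in a row, or None.

    allowed=None imitates LEN(1) $ CHAR *CHAR; a string such as 'AEIOU'
    imitates ANY(VOWELS) $ V *V.
    """
    for i in range(len(word) - 1):
        ch = word[i]                  # immediate assignment to CHAR (or V)
        if allowed is not None and ch not in allowed:
            continue
        if word[i + 1] == ch:         # the deferred element uses the value just assigned
            return ch
    return None

print(first_double("BOOKKEEPER"))
print(first_double("BALLAD"))
print(first_double("BALLAD", "AEIOU"))
```

BOOKKEEPER yields O and BALLAD yields L, but BALLAD restricted to vowels yields nothing, since its only doubled character is a consonant.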

The unary * operator may be used only before names of
variables, not before references to pattern procedures. An
expression composed of a deferred evaluation operator and a
variable name is of datatype Pattern and so may be used only
where a pattern value is appropriate: hence such an
expression may not be used as the argument of any of the
pattern procedures except ARBNO(). The loop

         ARBPAT = 'S' ARBNO(*X) . SAVE 'S'
LOOP4    X = TRIM(INPUT)                :F(DONE)
         WORD ARBPAT                    :S(FOUND) F(LOOP4)

specifies a search through WORD for zero or more instances
of whatever string is specified on the next data record,
bounded by an S on either side, and the assignment of the
substring matched by ARBNO() to the variable SAVE. If the
search fails, another data record is read, causing a
different pattern to be sought.

The Special Pattern Variables ARB and REM. There are
six variables which have predefined patterns as their
values, assigned by the Snobol system; these are the only
six variables in Snobol which do not have the null value
when execution of a program begins. The values of these
variables may be changed in a program by assigning them new
values in the usual way, but then of course the predefined
values are lost. The six special pattern variables are ARB,
REM, BAL, FAIL, FENCE, and ABORT. Only ARB and REM will be
discussed here. (The remaining four pattern variables are
described in Appendix B.)

The variable ARB has as its predefined value a pattern
equivalent to ARBNO(LEN(1)), that most arbitrary pattern
which will match the null value or any string of characters.
ARB, like ARBNO(LEN(1)), matches the longest string of
characters left for it by surrounding pattern elements; thus
the pattern to match any parenthesized string could have
been written as

     LINE '(' ARB . INSIDE ')'          :F(NOPAREN)

Execution of this statement would cause the variable INSIDE
to be assigned the zero or more characters occurring between
a pair of parentheses.

The variable REM has as its predefined value a pattern
which will match "all the remaining (none-or-more)
characters." Another pattern equivalent to this is RTAB(0).
For example, a statement to match all characters after the
sixth may be written as

     LINE LEN(6) REM . A6               :F(NOTSIX)

Execution of this statement will cause LEN(6) to match the
first six characters in LINE and will cause all remaining
characters to be assigned to the variable A6. If the value
of LINE is exactly six characters long, the pattern match
will succeed and the variable A6 will be assigned the null
value. If the value of LINE is less than six characters long
the pattern match will fail, A6 will not acquire a new value
and control will be sent to the statement labelled NOTSIX.

Since the predefined pattern values of both ARB and REM
are equivalent to patterns which may easily be written in
other ways, ARB and REM may be regarded merely as convenient
predefined abbreviations for longer pattern specifications.

A Program to Illustrate Pattern-Matching. The program
text provided below reads an indefinitely long text which
has line numbers in the first six positions of each data
record, and words occurring in free form, but never broken
across records, in the remaining positions. A word is
defined as a string of characters followed by a space or a
punctuation character. Any number of spaces and/or
punctuation characters may occur between words (and before
the first word on a card). The program looks for words
within the text which begin and end with the same character
(one-letter words excluded). If such words are found, they
are printed following the line number of the record in which
they occurred. Thus the two records

     000001 EFFICIENCY IS IMPORTANT BUT
     000002 ELEGANCE IS TO BE DESIRED

would produce the output

     000002 ELEGANCE DESIRED

since the first line contains no words which begin and end
with the same character, but the second line contains two.
All patterns are assigned to variables for the sake of
efficiency.

*    PROGRAM TO FIND AND PRINT ALL WORDS THAT
*    BEGIN AND END WITH THE SAME CHARACTERS
*
*    SET UP THE PATTERNS NEEDED FOR THE PROGRAM
*
         PUNC = ' .,:;'
         WORD.PAT = BREAK(PUNC) . WORD SPAN(PUNC)
         ID.PAT = LEN(6) . ID (SPAN(PUNC) | NULL)
         SAME.PAT = POS(0) LEN(1) $ CH RTAB(1) *CH
*
*    READ THE NEXT RECORD OF THE DATA - APPEND A SPACE
GETLINE  LINE = TRIM(INPUT) ' '                  :F(END)
*
*    REMOVE ID NUMBER - IGNORE RECORDS SHORTER THAN 6 CHARS
         LINE ID.PAT = NULL                      :F(GETLINE)
*
*    GET THE NEXT WORD - IF NO MORE WORDS, CONSIDER PRINTING
GETWORD  LINE WORD.PAT = NULL                    :F(PRINT)
*
*    SEE IF THIS WORD HAS SAME FIRST AND LAST CHARS - IF NOT,
*    THEN GET THE NEXT WORD
         WORD SAME.PAT                           :F(GETWORD)
*
*    WORD TO BE PRINTED - APPEND IT TO THE OUTPUT LINE
         OUT = OUT '    ' WORD                   :(GETWORD)
*
*    PRINT VALUE OF OUT IF IT CONTAINS ANY WORDS
*    PRECEDE THE WORDS BY THE APPROPRIATE LINE NUMBER
PRINT    OUTPUT = DIFFER(OUT,NULL) ID OUT        :F(GETLINE)
*
*    IF NECESSARY, ASSIGN OUT A NULL VALUE BEFORE PROCEEDING
         OUT = NULL                              :(GETLINE)
END
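
For comparison, the same task may be sketched in Python. This is an analogue under assumptions, not a translation of the Snobol text: the function name same_ends is hypothetical, string methods replace the three patterns, and the four-space separator mirrors the literal used in the assignment to OUT.

```python
PUNC = " .,:;"

def same_ends(records):
    """Yield one output line for each record that contains qualifying words."""
    for rec in records:
        line = rec.rstrip() + " "          # TRIM(INPUT) ' '
        if len(line) < 6:                  # ID.PAT needs six characters
            continue
        ident, rest = line[:6], line[6:]
        for p in PUNC:                     # every punctuation mark separates words
            rest = rest.replace(p, " ")
        # Keep words of two or more characters whose first and last agree.
        words = [w for w in rest.split() if len(w) > 1 and w[0] == w[-1]]
        if words:
            yield ident + "".join("    " + w for w in words)

data = ["000001 EFFICIENCY IS IMPORTANT BUT",
        "000002 ELEGANCE IS TO BE DESIRED"]
for out in same_ends(data):
    print(out)
```

Only the second record produces a line, carrying the words ELEGANCE and DESIRED after its line number.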






5A. INDIRECT REFERENCING 

The fact that a single variable may be used to refer to
a number of different values during the course of program
execution makes it possible to write a general rule which
can have the effect of many specific ones. For example, the
single rule

     OUTPUT = WORD

specifies in general that the current value of the variable
named WORD is to be printed, whatever that value may be. If
the above rule is part of a loop in which WORD is being
assigned a new value every time the loop is entered, then
the rule sends different specific characters to the output
file every time it is executed. Without this ability to
express a process in general terms rather than in specific
ones, no useful programs could be written.

The ability to generalize is further extended in Snobol
by the use of indirect referencing. This operation allows
one to specify a variable without writing its name into the
program text; rather, one specifies a variable by writing an
expression whose value is a variable. Just as WORD in the
rule above may refer to a number of different values during
the course of program execution, so this expression
involving indirect referencing may refer to a number of
different variables during the course of the program, each
variable's value changing independently. In neither case do
the specific values need to be known when the program text
is written. Hence the use of indirect referencing allows
another level of generality to be introduced.

The Indirect Referencing Operator. Indirect referencing
is accomplished by means of the indirect referencing
operator, a unary operator whose symbol is a dollar sign
($). This operator takes a single string-valued operand (or
one of datatype Name as described in Chapter 7) and returns
as its value the variable named by that string. In the
simplest case, the operand is a literal as in the rule

     OUTPUT = $'WORD'

which produces the same effect as

     OUTPUT = WORD

Both will cause the current value of the variable WORD to be
printed since the variable returned by the $ operator above
is the one whose name is WORD. There is no advantage to
using the $ operator in this way, since it is simpler to
write WORD than to write $'WORD'.

However, there are many variables which cannot be
referred to by writing their names in program texts since
they consist of strings of characters which are not
identifiers. As indicated in Chapter 2,

     1RHYME     ..VOWELS     TEXT/3     P-V-C

are all the names of variables, but they are not valid
representations of these variables within a program text.
These variables may be represented with the use of the $
operator, since they are, respectively, the values of the
expressions

     $'1RHYME'     $'..VOWELS'     $'TEXT/3'     $'P-V-C'

Although these expressions are useful in a way that $'WORD'
is not, they introduce no generality into the program since
each specifies a single, fixed variable.

Generality is introduced when the operand of the $
operator is some string-valued expression other than a
literal. Thus the rule

     OUTPUT = $WORD

can cause the values of different variables to be printed
when it is executed at different times, since the variable
whose value is to be printed depends on the current value of
WORD. If the rules

     WORD = 'SASSAFRAS'

and

     SASSAFRAS = 'TREE'

have been executed, then execution of the rule

     OUTPUT = $WORD

will cause the characters TREE to be printed. First WORD is
evaluated to yield the string SASSAFRAS; then the $ operator
returns the variable named by that string. Thus the effect
is as though

     OUTPUT = $'SASSAFRAS'

or, equivalently,

     OUTPUT = SASSAFRAS

had been executed.

Similarly, the rule

     $VOWEL = $VOWEL + 1

can cause the value of many different variables to be
incremented by 1. If the value of VOWEL is the string A,
then the rule is equivalent to

     $'A' = $'A' + 1
or
     A = A + 1

but if the value of VOWEL is a different vowel, say E for
example, then the rule is equivalent to

     E = E + 1

instead. Thus executing the same rule at different times in
the program may result in incrementing the value of
different variables. A single rule of this form could be
used to count how many of each vowel occurred in a text.
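
A reader familiar with other languages may find it helpful to think of the $ operator as a lookup in a table keyed by variable names. The Python sketch below rests on that assumption: the table called variables is a hypothetical stand-in for Snobol's variable store, and a default of zero imitates the way the null value behaves as zero in arithmetic.

```python
from collections import defaultdict

variables = defaultdict(int)    # hypothetical stand-in for the variable store

def count_vowels(text):
    for ch in text:
        if ch in "AEIOU":
            vowel = ch                  # VOWEL = 'A', 'E', and so on
            variables[vowel] += 1       # $VOWEL = $VOWEL + 1
    return variables

count_vowels("SASSAFRAS TREE")
print(variables["A"], variables["E"])
```

The single incrementing statement serves every vowel; for this text it leaves 3 in the variable named A and 2 in the variable named E.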

(Notice that a variable returned by the indirect
referencing operator is treated in the execution of rules
exactly like a variable whose name is written in the program
text; variables occurring to the right of an assignment
sign, or within a pattern or a string reference, must be
evaluated when the rule in which they occur is executed.)

The Operand of the Indirect Referencing Operator. The
operand of an indirect referencing operator may be an
expression of any complexity; the only restriction is that
this expression yield a non-null string (or a Name) when it
is evaluated. Thus the operand of a $ operator may itself
contain one or more $ operators (as in the expression
$$CURRENT), as long as the variable returned by each inner $
operator refers to a value which is a string. These nested $
operators, like nested procedure calls, must be evaluated
from the inside out since the variable returned by an inner
$ is needed to form the operand of an outer $. For example,
if the assignments

     CURRENT = 'VOWEL'
and
     VOWEL = 'A'

have been executed, then the rule

     $$CURRENT = $$CURRENT + 1

is equivalent to

     A = A + 1

The evaluation of the rule involving double indirect
referencing proceeds as follows: first the value of CURRENT
is determined, providing the string VOWEL as the operand of
the inner $ operator and making the expression $$CURRENT
equivalent to $$'VOWEL'; when the inner $ is applied to the
string VOWEL the variable VOWEL is returned, making
$$'VOWEL' equivalent to $VOWEL; the outer $ is then applied,
giving $'A', in turn equivalent to A, as above. Examples of
how multiple indirect referencing can be useful are provided
by two program texts given at the end of this chapter.
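
The inside-out evaluation of $$CURRENT can be traced step by step in Python. As before, this is only an analogy: the dictionary variables and the helper named_by are hypothetical, standing in for the variable store and for the $ operator used to fetch a value.

```python
variables = {"CURRENT": "VOWEL", "VOWEL": "A", "A": 0}

def named_by(name):
    """Fetch the value of the variable whose name is the string `name`."""
    return variables[name]

inner = named_by("CURRENT")     # 'VOWEL': $CURRENT is the variable VOWEL
target = named_by(inner)        # 'A': $$CURRENT is the variable A
variables[target] = variables[target] + 1    # A = A + 1
print(target, variables["A"])
```

Two lookups are needed before the assignment can take place, one for each $ in the expression.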



Similarly, a reference to any procedure which returns a
string as its value may be used within the operand. As a
simple example, the rule

     $SIZE(WORD) = $SIZE(WORD) + 1

could be used in a loop, analogously to the rule

     $VOWEL = $VOWEL + 1

above, to count how many words of each length occurred in a
text. If the current value of WORD at some point during
execution is the nine-character string SASSAFRAS, then the
above rule is equivalent to

     $'9' = $'9' + 1

Thus the variable whose name is 1 would be assigned the
count of the one-character words, the variable named 2 the
count of the two-character words, etc. Although the names of
these variables may not be written in the program text, the
variables may be specified by means of indirect referencing,
since the $ operator may be applied to any string of
characters to return the variable named by that string.

The null value may not be used as the operand of the $
operator since the name of a variable must be at least one
character long. It is a common mistake, however, to use as
the operand of the $ operator a variable which at some time
during the course of execution will have a null value. Such
an error cannot occur in the example above, since there is
no way for the operand to be null. If WORD has a null value,
then SIZE(WORD) returns the integer zero as its value. Hence
the count of all null values is referred to by the variable
whose name is 0. (If WORD has a value which is not a string,
then an execution-time error will result when the SIZE()
procedure is called, before an attempt to apply the $
operator can be made.)

A Program to Produce a Character Count. As an example
of the power of indirect referencing, consider this simple
character-counting program, which prints out a table giving
the number of times each letter occurred within a text.

*    PROGRAM TO MAKE A CHARACTER COUNT
*    SET UP CHARACTER-FINDING PATTERN
         CHAR.PAT = LEN(1) . CHAR
*
*    READ IN THE DATA
READ     LINE = TRIM(INPUT)                      :F(OUT)
*
*    FIND THE NEXT CHARACTER - ASSIGN IT TO THE VARIABLE CHAR
LOOP1    LINE CHAR.PAT = NULL                    :F(READ)
*
*    ADD ONE TO THE COUNT FOR THAT CHARACTER
INC      $CHAR = $CHAR + 1                       :(LOOP1)
*
*    SPECIFY THE ALPHABET FOR RECOVERING COUNTS
OUT      ALPHA = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
*
*    GET THE NEXT LETTER WHOSE COUNT IS TO BE RECOVERED
*    ASSIGN IT TO THE VARIABLE CHAR
LOOP2    ALPHA CHAR.PAT = NULL                   :F(END)
*
*    IF LETTER DID NOT OCCUR, GIVE IT THE VALUE ZERO, NOT NULL
         $CHAR = IDENT($CHAR,NULL) 0
*
*    PRINT LETTER AND ITS COUNT
         OUTPUT = CHAR '    ' $CHAR              :(LOOP2)
END

Output from this program would be a list of the form

     A    129
     B    58
     C    32

and so on.

This program uses the pattern which is the value of
CHAR.PAT to assign each successive character of the text to
the variable CHAR; indirect referencing is then used to
return the variable named by that character. Depending on
which character has been found, the rule part of the
statement labelled INC might be equivalent to

     A = A + 1
or
     B = B + 1
or
     $',' = $',' + 1

or whatever.

When all the text has been read, printing of the counts
begins. This is done with the use of the variable ALPHA,
whose value is a string containing all the characters for
which counts are to be printed, given in the desired order.
(In this case, only letters have been chosen.) These
letters, one by one, are again assigned to the variable CHAR
(although any other variable would have done as well) by
means of the CHAR.PAT pattern. Using indirect referencing,
the variable named by the character is tested to determine
whether or not it has a null value; if it is null, then that
character was never encountered in the text and so the
variable is given the value zero for output purposes. The
output statement prints the value of CHAR (the character A
the first time the output loop is entered) and the value of
$CHAR (in this case the value of the variable A, or 129).

This scheme for specifying the printing permits the
programmer to choose the order of the output (alphabetical
order, rather than text order) and to be selective; the
program causes counts to be stored for all characters
(numbers, punctuation, spaces, etc.), but only the counts
for the letters are recovered for printing.
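
The two phases of the program, counting everything and then recovering only the letters in a chosen order, can be sketched in Python as follows. The function name letter_table is hypothetical, and a Counter is assumed here as a stand-in for the variables reached through $CHAR; its default of zero plays the part of the IDENT() test that replaces a null count by zero.

```python
from collections import Counter

def letter_table(lines, alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    counts = Counter()
    for line in lines:
        # Every character is counted; TRIM removes trailing blanks only.
        counts.update(line.rstrip(" "))
    # Only the letters are recovered, in the order given by the alphabet.
    return [f"{ch}    {counts[ch]}" for ch in alphabet]

rows = letter_table(["A CAB, A BAA"])
for row in rows[:3]:
    print(row)
```

The comma and the spaces in the sample line are counted along with the letters, but they never appear in the table because they are absent from the alphabet string.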


Concatenation within the Operand. The concatenation
operator is needed within the operand of the indirect
referencing operator in applications in which variables
having "successive" names are to be used. For example,
execution of a loop of the form

NLOOP    N = N + 1
         OUTPUT = TRIM(INPUT)                    :F(ALLGONE)
         $('LIST' N) = OUTPUT                    :(NLOOP)
ALLGONE

will cause an entire group of data to be read, printed, and
stored, with successive records being assigned as the values
of the variables named LIST1, LIST2, ..., $('LIST' N). When
the loop terminates through failure of INPUT, the value of N
is an integer one greater than the number of lines of data
which have been read. Since these lines of data are now
stored in the memory they may be processed in some way, for
example subjected to pattern-matching and replacement, and
eventually printed out again in an altered form. The
following loop may be used to print out all the lines,
reversing their line numbers in the output, so that the last
record read in is numbered 1, the next-to-last numbered 2,
etc., until the first record read in is numbered N-1:

         M = N
MLOOP    M = GT(M,1) M - 1                       :F(DONE)
         OUTPUT = N - M '    ' $('LIST' M)       :(MLOOP)
DONE

In the above example, a single set of successively-named
variables was being assigned values (those whose
names all begin with the characters LIST). This process can
be made more general if several sets of successively-named
variables are assigned values by the same program segment.
If, for example, a file contained intermixed records of
various types, each type distinguished by the first
character of the record, then the following segment of
program text would cause each record to be assigned to the
variable named by the concatenation of its first character
(the type-code) and the number of records of that type
encountered so far.

READ     RECORD = TRIM(INPUT)                    :F(DONE)
*
*    DETERMINE TYPE-CODE OF RECORD
         RECORD LEN(1) . CODE                    :F(READ)
*
*    ADD ONE TO COUNT FOR THIS TYPE
         $CODE = $CODE + 1
*
*    STORE RECORD IN NEXT "SUCCESSIVE" VARIABLE OF ITS TYPE
         $(CODE $CODE) = RECORD                  :(READ)
DONE

The first record found beginning with an E would become
the value of the variable named E1, for example, and the
twenty-fifth record found beginning with a colon would
become the value of the variable named :25. If the distinct
type-codes are stored by the program as they are
encountered, then the records have effectively been sorted
in terms of their first characters, since the records of
each type can now be found as the values of different sets
of successively-named variables.
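
The naming scheme can be checked with a small Python sketch. It is an analogy under assumptions: the function store_by_type is hypothetical, and a dictionary keyed by name stands in for the variables reached through the $ operator.

```python
def store_by_type(records):
    variables = {}                      # hypothetical stand-in for the variable store
    for record in records:
        record = record.rstrip(" ")     # TRIM(INPUT)
        if not record:                  # LEN(1) fails on a null record
            continue
        code = record[0]                # the type-code
        variables[code] = variables.get(code, 0) + 1       # $CODE = $CODE + 1
        variables[code + str(variables[code])] = record    # $(CODE $CODE) = RECORD
    return variables

stored = store_by_type(["E first", ":x", "E second"])
print(stored["E1"], "/", stored["E2"], "/", stored[":1"])
```

Each type-code names its own counter, and the counter's current value supplies the numeric suffix of the next variable of that type.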






Variables having "saccessive" names are also useful in 
printing data in tabular format, where a varying number of 
cpaces, or other characters such as dots or dashes, will be 
needed to make the data line up properly. The variable named 
1n, for example, could be assigned tho value of a single 
space, while the variable named 2n would have the value of 
two spaces, etc. In general, variables can be given names 
which indicate their values, where the first part of the 
name indicates the number of instances of some character, 
and the second part indicates the character in question. 
Thus the variable named 52X would have as its value a string 
of 52 X's. 

The short segment of program text below causes such 
variables to be assigned appropriate values. The value of 
MAX is the largest number to be used as the first part of 
any name and is the maximum length of any string to be 
assigned as value; the value of CHAB is the particular 
character to be used as the second part of each name and is 
the character of which all string values are to be composed. 

POPMLOOP N - IT (N, MAX) N ♦ 1 : F(DONR) 

$(N CRAR) = $(N - 1 CHAR) CHAP : (FORKLOOP) 
DONF 

If WAX has the value 10 and CHAH has the value of a 
single dash, then execution of the loop causes the set of 
variables named 1-,2-, . . . , 10- to be assigned the respective 
values -, — ?•••# • 

A program may begin by executing the FORHLOOP segment 
repeatedly for each pair of values of CHAR and MAX needed to 
generate the strings which may be required for formatting 
within the remainder of the program. Then whenever, say, a
string of 42 spaces is needed it may be represented by the
expression $(42 '□'), and whenever 10 periods are needed
they may be represented by the expression $(10 '.'),
provided the FORMLOOP segment has been executed when the
value of MAX was at least 42 and the value of CHAR was a
space, and when the value of MAX was at least 10 and the
value of CHAR was a period. If an expression of this form is
written in which the numeric part lies outside the range
specified (from 1 to the value of MAX) when the set of
variables involved was given value, or in which the
character part is not a character which was the value of
CHAR when the FORMLOOP segment was executed, then the null
value is likely to result; a variable will always be
returned from an expression of this form, but not 
necessarily one to which a value has been assigned. 
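
The behavior just described can be tried with a short,
self-contained sketch. The values chosen for MAX and CHAR and
the text printed are assumptions made only for this
illustration; the first OUTPUT rule uses a variable which the
loop has formed, and the second deliberately asks for one
outside the range.

```snobol
* BUILD THE VARIABLES NAMED 1., 2., ... 5. (ASSUMED VALUES)
          MAX = 5
          CHAR = '.'
FORMLOOP  N = LT(N,MAX) N + 1                      :F(DONE)
          $(N CHAR) = $(N - 1 CHAR) CHAR           :(FORMLOOP)
* USE ONE OF THEM TO PAD A LINE OF OUTPUT
DONE      OUTPUT = 'CHAPTER' $(3 '.') '12'
* THE VARIABLE NAMED 9. WAS NEVER ASSIGNED, SO ITS VALUE IS NULL
          OUTPUT = 'CHAPTER' $(9 '.') '12'
END
```

If the sketch is run as written, the first line printed
should be CHAPTER...12 and the second CHAPTER12, since the
variable named 9. has never been assigned a value.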



5A. Indirect Referencing.



Concatenation within the operand is also useful as a 
safeguard against conflicts which occur when a variable 
returned by the $ operator turns out unexpectedly to be the 
same as one written directly in the program text as an 
identifier, and used for some unrelated purpose. In the 
character-counting example above, the writing of any
one-character name within the program text would have produced a
conflict of usage if that character had occurred within the
text being processed. In that particular case, only
variables with one-character names could be returned so the
restriction could be made that no one-character names be
written in the program text. Often, however, there is no way
of knowing which variables will be returned by indirect
referencing. Consider the case of counting words, rather
than characters, in a text; if the same scheme is employed, 
then each word of the text will be used as the name of a 
variable, and there is often no restriction on which words 
may occur, so a conflict in the use of variables is likely. 

Such conflicts may be avoided by using concatenation 
within the operand of the $ operator to produce a string
which is not an identifier; then the variable returned by
applying the $ operator to this string will necessarily be
one whose name can never be written in the program text.
This has been done in the formatting example above by always
using a number as the first part of the name, so these names
are never in identifier form. Similarly, if the expression
$('*' CHAR) were used in place of $CHAR throughout the
character-counting program text above, the restriction
against the use of one-character names within the program
text could be removed; the number of A's in the text would
then be referred to by the variable named *A, the number of
B's by *B, etc. The two complete program texts which follow
in this chapter both rely on concatenation of this form to 
insure against the possibility of error due to conflict. 

A Program to Produce a Frequency Table. The usefulness
of multiple indirect referencing is illustrated in the
following program, which is similar to the
character-counting program but produces instead a frequency table
specifying how many letters failed to occur in the text, how
many occurred once, how many twice, etc. The program begins
in the same way as the character-counting program, by using
a variable named by a character to refer to the number of
times that character occurred within the text. When all the
text has been read in, the character counts themselves are
used as the operands of the $ operator to return variables
whose names are 0, 1, 2, ..., etc.; the values of these
variables are increased by one for each character which 
occurred that many times within the text. 






Concatenation is used in this example to prevent the
conflict of variable usage which would occur if the text
contained any digits. If concatenation were not used and the
text contained, for example, some 3's, then the variable
named 3 would be used in the first part of the program to
refer to the number of 3's occurring in the text; in the
second part, when the frequency table was being formed, the
variable named 3 would be used to refer to the number of
characters which occurred exactly three times in the text.
Since the variable named 3 would then already have a value
indicating the number of 3's in the text, the frequency
table for 3 occurrences would be incorrect. (The program
would appear to run correctly and the only indication of
error might be an abnormally high count.) Thus concatenation
is used to return a variable whose name is 3* for the first
part of the program; the frequency table for characters
occurring 3 times can then safely be made with a variable
whose name is simply 3.



* PROGRAM TO MAKE A FREQUENCY TABLE
*
          CHAR.PAT = LEN(1) . CHAR
READ      LINE = TRIM(INPUT)                      : F(CHARS)
LOOP1     LINE CHAR.PAT = NULL                    : F(READ)
          $(CHAR '*') = $(CHAR '*') + 1           : (LOOP1)
*
* SPECIFY THE CHARACTERS WHOSE FREQUENCIES ARE TO BE FOUND
CHARS     ALPHA = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
LOOP2     ALPHA CHAR.PAT = NULL                   : F(PRINT)
*
* GIVE MAX THE VALUE OF THE LARGEST COUNT SO FAR FOUND
          MAX = GT($(CHAR '*'),MAX) $(CHAR '*')
*
* CHANGE ANY NULL VALUE TO ZERO
          $(CHAR '*') = IDENT($(CHAR '*'),NULL) 0
*
* USE DOUBLE INDIRECT REFERENCING TO MAKE A COUNT OF COUNTS
FREQ      $$(CHAR '*') = $$(CHAR '*') + 1         : (LOOP2)
*
* PRINT THE FREQUENCY TABLE
PRINT     COUNT = 0
*
* IF NO LETTERS OCCURRED COUNT TIMES, SKIP IT
LOOP3     IDENT($COUNT,NULL)                      : S(SKIP)
          OUTPUT = $COUNT '□LETTERS□OCCURRED□' COUNT '□TIMES'
*
* INCREASE THE VALUE OF COUNT UNTIL THE MAXIMUM IS REACHED
SKIP      COUNT = LT(COUNT,MAX) COUNT + 1         : S(LOOP3)
END






Output from this program would be of the form 

     2 LETTERS OCCURRED 0 TIMES
     4 LETTERS OCCURRED 1 TIMES
     2 LETTERS OCCURRED 4 TIMES
     7 LETTERS OCCURRED 6 TIMES

and so on. Such a table would have at most 26 entries; all 
26 would be present only if each letter had a different 
character count associated with it. 

The statement labelled FREQ uses double indirect
referencing to form variables from these character counts.
Its rule represents assignments of the form

     $'0' = $'0' + 1
     $'1' = $'1' + 1
     $'2' = $'2' + 1

The value assigned to each of these variables is increased 
by one every time a character is found which occurred that 
many times in the text. 

(Note that it is necessary to assign the value zero 
rather than the null value to variables representing
characters which did not appear in the text. If this were
not done, the rule part of the statement labelled FREQ would
attempt to represent a rule of the form

     $'' = $'' + 1

if the value of $(CHAR '*') was null, and an execution-time
error would result.)
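
A reduced sketch of the FREQ statement may make the double
reference easier to follow. The three counts assigned at the
start are invented for the illustration (they stand in for
the counts which the first part of the program would have
formed), and the label names LOOP and SHOW are likewise
hypothetical.

```snobol
* ASSUMED COUNTS, AS IF LEFT BY THE FIRST PART OF THE PROGRAM
          $'A*' = 2
          $'B*' = 0
          $'C*' = 2
          ALPHA = 'ABC'
LOOP      ALPHA LEN(1) . CHAR = NULL               :F(SHOW)
* $(CHAR '*') IS THE COUNT VARIABLE FOR THIS CHARACTER;
* $$(CHAR '*') IS THE VARIABLE WHOSE NAME IS THAT COUNT
          $$(CHAR '*') = $$(CHAR '*') + 1          :(LOOP)
SHOW      OUTPUT = $'2' ' LETTERS OCCURRED 2 TIMES'
          OUTPUT = $'0' ' LETTERS OCCURRED 0 TIMES'
END
```

Here A and C each occurred twice and B not at all, so the
variable named 2 should end with the value 2 and the
variable named 0 with the value 1.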

A Program to Produce a Word Count. As a further example
of the use of both multiple indirect referencing and 
concatenation, consider the following word-counting program 
which works on the same principle as the character-counting 
program; it uses each word as the name of a variable and 
increases the value of that variable by one whenever the 
word occurs within the text. The process of printing out the 
words once the counts have been formed, however, is 
necessarily more complicated than that of printing a 
character count. While it is possible to specify all the 
characters which may occur in a text, it is seldom possible 
to specify all the words. If counts are desired for only 
certain words, then a list of those words can be supplied as 
data to the program; but if all words are to be counted, or 
all words except those specified, then some record must be
kept by the program of all different words encountered so
they may be retrieved. In this program, concatenation is
used to assign each new word to a variable whose name is of
the form W/1, W/2, W/3, etc., so that all words of the text
may be recovered for printing with the use of those
"successive" variables. 

* PROGRAM TO MAKE A WORD COUNT
*
* SET UP WORD-FINDING PATTERN
          PUNC = '□.,:;'
          WORD.PAT = BREAK(PUNC) . WORD SPAN(PUNC)
*
* READ TEXT AND FIND WORDS
READ      LINE = TRIM(INPUT) '□'                  : F(OUT)
LOOP1     LINE WORD.PAT = NULL                    : F(READ)
*
* USE CONCATENATION IN FORMING THE WORD COUNT
          $('*' WORD) = $('*' WORD) + 1
*
* TEST TO SEE WHETHER THIS IS A NEW WORD
* IF NOT, RETURN TO LOOP1
          EQ($('*' WORD),1)                       : F(LOOP1)
*
* NEW WORD - ASSIGN IT TO A VARIABLE NAMED W/1, W/2, ETC.
          N = N + 1
          $('W/' N) = WORD                        : (LOOP1)
*
* ALL DATA HAS BEEN READ IN - PRINT WORD COUNT TABLE
OUT       M = LT(M,N) M + 1                       : F(END)
          OUTPUT = $('W/' M) '□□□□' $('*' $('W/' M))  : (OUT)
END

The words are printed in the order of their first 
occurrence in the text. Output for a well-known six-word 
text would be 



     TO    2
     BE    2
     OR    1
     NOT   1



In the processing of this short text, the rule 
     $('*' WORD) = $('*' WORD) + 1
at different times is equivalent to rules of the form






     $'*TO' = $'*TO' + 1
     $'*BE' = $'*BE' + 1
     $'*OR' = $'*OR' + 1
     $'*NOT' = $'*NOT' + 1

and the like, while the rule

     $('W/' N) = WORD

is equivalent to

     $'W/1' = 'TO'
     $'W/2' = 'BE'
     $'W/3' = 'OR'
     $'W/4' = 'NOT'

When the first line of the output is printed, the
output statement

     OUTPUT = $('W/' M) '□□□□' $('*' $('W/' M))

is equivalent to

     OUTPUT = $'W/1' '□□□□' $('*' $'W/1')
or
     OUTPUT = $'W/1' '□□□□' $'*TO'
or
     OUTPUT = 'TO□□□□2'

Indirect Referencing within the Go-to. The indirect
referencing operator may be used within the go-to part of a
statement as well as within the rule. When the $ operator is
used within the go-to, it takes the string which is its
operand and returns the label which is that string. Thus the
go-to's

     : ($'READ')
and
     : (READ)

have the identical effect of causing a transfer to be taken
to the statement labelled READ.

(Note that the $ operator must appear inside the
parentheses rather than outside, since the only characters
which may appear between the colon and the open parenthesis
of the go-to are an S or an F. Thus the go-to : $('READ')
is syntactically incorrect. Inner parentheses, such as
: ($('READ' N)) are permissible.)






As before, the power of indirect referencing becomes
visible only when the operand consists of something besides
a literal. The statement

     LINE LEN(6) . CODE                    : S($CODE)

illustrates the usefulness of the $ operator within the
go-to. It causes the first six characters in the value of LINE,
if there are that many, to be assigned to the variable CODE,
and then, on success, transfers to the label specified by
those six characters. (The value of CODE which was obtained
in the rule part of the statement is immediately available
for use within the go-to.) The single general go-to
($CODE) may thus represent a great many specific go-to's,
one for each possible value of CODE. These values which CODE
may acquire must all be in identifier form, since an
individual label must actually exist within the program for
every possible transfer which is taken. (The indirect
referencing operator may not be used in the label field, so
there is no way of using a label which is not an
identifier.) If an attempt is made to transfer to a
non-existent label, an execution-time error will result.
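
The following sketch shows one way such a dispatching
statement might be used. The two codes INSERT and DELETE, and
the form of the data records, are assumptions made for the
illustration; any record whose first six characters are not
one of these two codes would cause the execution-time error
described above.

```snobol
* EACH DATA RECORD IS ASSUMED TO BEGIN WITH A SIX-CHARACTER CODE
READ      LINE = TRIM(INPUT)                       :F(END)
* RECORDS SHORTER THAN SIX CHARACTERS ARE SIMPLY SKIPPED
          LINE LEN(6) . CODE                       :S($CODE)F(READ)
* ONE LABEL MUST EXIST FOR EVERY CODE THAT CAN OCCUR IN THE DATA
INSERT    OUTPUT = 'INSERTION: ' LINE              :(READ)
DELETE    OUTPUT = 'DELETION: ' LINE               :(READ)
END
```

The single go-to S($CODE) takes the place of a separate test
and transfer for each code, at the price of requiring a label
for every code which the data may contain.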

If the special variable INPUT occurs within a go-to in 
which an indirect referencing operator is used, as in 

     EQ(X,Y)                               : S($(TRIM(INPUT)))

it is assigned as value the next data record, since this
string value is needed as the operand of the $ operator. If
the next data record had the characters NOUN as its first
four characters, followed by spaces, the go-to shown above
would send control to the statement labelled NOUN if the 
rule preceding the go-to succeeded. If INPUT fails, or any 
other failure occurs in a go-to, then an execution-time 
error results, since no information will be available as to 
which statement is to be executed next. 

Concatenation is often used within the go-to to send 
control to "successive" labels of the program. For example, 
the statement 

     N = SIZE(WORD)                        : ($('RULE' N))

assigns to N the integer length of the value of WORD, and
then transfers control to a label specified by concatenating
the characters RULE and this integer; if WORD has as its
value any one-character string, a transfer would be taken to 
the statement labelled RULE1; if WORD has as value a two- 
character string, then control would be sent to RULE2, etc.
(The statements starting at RULE1 would presumably specify
some process to be performed on one-character words, which
would be different from the process at RULE2 for
two-character words, etc.) The same effect could be achieved by
writing

     : ($('RULE' SIZE(WORD)))

Note that some device such as the concatenation of an
alphabetic literal is necessary in the above example, since
one may not write simply

     : ($N)
or
     : ($SIZE(WORD))

These go-to's would send control to labels of the form 1, 2,
3, etc., and such labels do not exist since they may not be
written in the program. Indirect referencing within the
go-to is often useful, but is more limited than indirect
referencing within the rule: the string designating a label
must always be in identifier form and a corresponding label
must exist in the program text in order for the transfer to
be taken; on the other hand, the string designating the name
of a variable may be composed of any characters, since any
string names a variable, and there is no need for that
variable to have been used in any prior statement of the
program.
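
A minimal sketch of this use of successive labels follows;
the value given to WORD and the messages printed are assumed
only for the illustration.

```snobol
* THE VALUE OF WORD IS ASSUMED FOR THE ILLUSTRATION
          WORD = 'TO'
          N = SIZE(WORD)                           :($('RULE' N))
* A LABEL MUST EXIST FOR EVERY LENGTH THAT CAN OCCUR
RULE1     OUTPUT = 'ONE-CHARACTER WORD'            :(END)
RULE2     OUTPUT = 'TWO-CHARACTER WORD'            :(END)
RULE3     OUTPUT = 'THREE-CHARACTER WORD'
END
```

Since the assumed value of WORD has two characters, the
transfer should be taken to RULE2.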






6A. Programmer-defined Procedures



In addition to supplying a number of useful predefined
procedures, Snobol provides a mechanism which allows a
programmer to define any procedure of his own choosing. This
permits the task which a program is to perform to be
expressed as a series of separate processes of varying
degrees of complexity, each of which is defined as a
procedure. The more complex procedures may consist mainly of
calls to simpler procedures which have been defined earlier;
many of these procedures, in turn, will make use of the
predefined procedures supplied by the Snobol system. Once
the necessary procedures have been written, the writing of a
program to perform some task is simplified since it can make
reference to the highest-level, most powerful procedures.
Program texts written in this fashion are easier to write
(and incidentally easier to read) because their organization
reflects the structure of the process embodied in the
program.



Defining a Procedure. A definition of a new procedure
requires two parts: first, the name of the procedure being
defined and the form of future references to that procedure
must be declared to the Snobol system; second, a description
(in Snobol) of what the procedure is to do must be provided,
which will be executed each time the procedure is called.

The declaration of a programmer-defined procedure is
accomplished by executing a predefined procedure, DEFINE(),
which in its simplest form has a single argument consisting
of a string which is a sample reference to the procedure.
For instance



     DEFINE('REPEAT(N,OBJECT)')

declares a new procedure, REPEAT(), which is defined to have
two arguments, represented by the names N and OBJECT. The
description of what the REPEAT() procedure is to do can be
anything expressible in Snobol. If its purpose is to
concatenate some object to itself n times, this might be
expressed as follows.

REPEAT    N = GT(N,0) N - 1                       : F(RETURN)
          REPEAT = REPEAT OBJECT                  : (REPEAT)



This section of program text, termed a "procedure
body," is written in accordance with a number of conventions
which are the subject of the following sections of this
chapter. It is identified as the procedure body for the
REPEAT() procedure by the label REPEAT, which has the same
form as the name of the procedure. The names N and OBJECT
are used both in the declaration and in the procedure body
to represent the two arguments with which the REPEAT()
procedure will be called. The value of N indicates how many
times the value of OBJECT is to be concatenated to itself to
form the value to be returned by the REPEAT() procedure.



The first statement of the procedure body specifies
that the value of N is to be decremented by one if it is
still greater than zero; the second statement specifies that
the value of OBJECT is to be concatenated to the value of
REPEAT, initially null, every time N is successfully
decremented. When the value of N becomes zero, then the
desired number of concatenations have been performed and the
failure transfer to RETURN is taken; this represents not any
fixed location in the program, but rather a request to the
Snobol system to return to whatever statement contained the
call to the REPEAT() procedure. The REPEAT procedure
returns as its value the current value of the variable named
REPEAT (again with the same form as the name of the
procedure) when the transfer to RETURN is taken.



Once the REPEAT() procedure has been declared and a
procedure body provided for it, then it may be invoked by a
procedure reference anywhere in the program text. For
instance, one might write the assignment rule

     OUTPUT = REPEAT(10,'X')

to specify that a string of 10 X's is to be printed.
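
A complete program text using REPEAT() might be sketched as
follows. The label MAIN and the second OUTPUT rule are
additions made for this illustration, and the unconditional
transfer on the DEFINE() statement, which keeps the main
program from running into the procedure body, anticipates a
convention explained later in this chapter.

```snobol
* DECLARE REPEAT() AND TRANSFER AROUND ITS PROCEDURE BODY
          DEFINE('REPEAT(N,OBJECT)')               :(MAIN)
* PROCEDURE BODY: CONCATENATE OBJECT TO THE RESULT N TIMES
REPEAT    N = GT(N,0) N - 1                        :F(RETURN)
          REPEAT = REPEAT OBJECT                   :(REPEAT)
* MAIN PROGRAM
MAIN      OUTPUT = REPEAT(10,'X')
          OUTPUT = 'A' REPEAT(3,'-') 'B'
END
```

The two lines printed should be a string of ten X's and the
string A---B.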



The REPEAT procedure provides a simpler method of
producing the varying length strings needed for formatting
than the scheme involving indirect referencing described in
Chapter 5. Here it is not necessary to store values with a
set of successively-named variables in advance of their use
in order to insure that a string of the right length will be
available; rather the needed string is generated by the
procedure call. Using REPEAT(), the alternate records of a
data group may be printed in a two-column format, such that
the first record of a pair is printed starting in column 1
and the second starting in a column which is the value of N,
with a sufficient number of the formatting character which
is the value of CH printed in between. The following program
segment may be used for that purpose.

LOOP      REC1 = TRIM(INPUT)                                  : F(DONE)
          REC2 = TRIM(INPUT)                                  : F(ERROR)
          OUTPUT = REC1 REPEAT((N - 1) - SIZE(REC1),CH) REC2  : (LOOP)






Since patterns may be concatenated to one another as
well as strings, the REPEAT() procedure may take a pattern
as its second argument and will then return a pattern as its
value. For example, the pattern-matching rule

     WORD REPEAT(3,ANY(VOWELS))            : S(YES3)

will succeed and send control to YES3 if the value of WORD
contains at least three contiguous vowels.
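
This rule can be placed in a small program text for trial, as
in the sketch below. The values given to VOWELS and WORD, the
label MAIN, and the two messages are assumptions made for the
illustration.

```snobol
          DEFINE('REPEAT(N,OBJECT)')               :(MAIN)
REPEAT    N = GT(N,0) N - 1                        :F(RETURN)
          REPEAT = REPEAT OBJECT                   :(REPEAT)
* THE VALUES OF VOWELS AND WORD ARE ASSUMED FOR THE ILLUSTRATION
MAIN      VOWELS = 'AEIOU'
          WORD = 'QUEUE'
* REPEAT() RETURNS THE PATTERN ANY(VOWELS) ANY(VOWELS) ANY(VOWELS)
          WORD REPEAT(3,ANY(VOWELS))               :S(YES3)
          OUTPUT = 'FEWER THAN THREE VOWELS TOGETHER'  :(END)
YES3      OUTPUT = 'THREE CONTIGUOUS VOWELS FOUND'
END
```

With the assumed value QUEUE the match should succeed, since
the letters UEU stand together in the word.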

Procedure names may be defined more than once in a
program and even the names of predefined procedures may be 
redefined (although there is seldom any reason for doing 
so). In each case, it is the most recent definition which 
establishes the current meaning of the procedure name, and 
any preceding definition is lost. 

The DEFINE() Procedure. The predefined procedure
DEFINE() will accept two arguments, both strings. The basic
form of the first argument consists of the name of the
procedure being defined followed by a parenthesized list of
names of "formal variables" (or "dummy variables") which are
used in the procedure body to represent the arguments with
which the procedure will be called; in the example above,
DEFINE('REPEAT(N,OBJECT)'), the procedure REPEAT() is
declared with the two formal variables N and OBJECT.

Procedure names and names of formal variables may be 
freely invented by the programmer, subject to the usual
restriction that they be identifiers. They may be the same
as names used elsewhere in the program text for other
purposes, because all the names in the first argument of the
DEFINE() procedure are used in a special way: when a
procedure is called, these names are all made to refer to
new variables, "internal" to the procedure call, which are
distinct from the variables to which the names previously
referred; they will continue to refer to these internal
variables until a return from the procedure call is made.
(This mechanism will be described in detail in following
sections of this chapter.) It turns out to be useful to have
other names which are made to refer to internal variables
for the duration of each procedure call; these names of
additional internal variables, if used, are written
immediately following the closing parenthesis of the formal
variable list. A definition of a PRINT() procedure, which
has three additional internal variables, could be

     DEFINE('PRINT(N,NAME)M,W,P')

The internal variables M, W, and P could then be used within
the procedure body where they might be assigned some values,
such as tallies, needed only during execution of the
procedure call. Notice that the list of additional internal
variables is an extension of the string which is the first
argument; no embedded blanks are permitted in this string.
There is no limit to the number of formal variables and
additional internal variables with which a procedure may be
declared.

It is also possible to declare a procedure with no
formal variables, as in

     DEFINE('RECORDS()')

if the process which the procedure is to perform is not
dependent on an argument list. The RECORDS() procedure, for
example, might be used to count all records in a group of
data read from the input file. Even though there is no
argument, the pair of empty parentheses must still appear,
both in the declaration and in every reference to the
procedure in a program text.

The second argument of the DEFINE() procedure is a
string which is the label of a statement in the procedure
body which is to be executed first whenever the procedure is
called; this label is termed the "entry label." If the
second argument is null or missing (and thus null by
default), as it has been in all previous examples, the entry
label is taken to have the same form as the procedure name.
Thus the declaration

     DEFINE('RECORDS()','RECORDS')

would have precisely the same effect as the preceding
example, of defining the entry label to be RECORDS.

More commonly, the second argument of DEFINE() is used
to insure that the entry label for a procedure body is
different from any label which may happen to appear
elsewhere in the program text, since all the labels of a
program must be unique. Thus the convention may be adopted
of forming all entry labels by preceding the name of the
procedure with the string PR.; the evaluation rule

     DEFINE('RECORDS()','PR.RECORDS')

declares that the entry label for RECORDS() is the label
PR.RECORDS, and the first statement to be executed in the
procedure body for the RECORDS() procedure must bear that
label. (The labels of the other statements of a procedure
body should also be protected from conflicts by adopting
some similar conventions.)

The DEFINE() procedure itself returns the null value
when it is executed.
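
The following sketch brings together a formal variable, an
additional internal variable, and an entry label formed with
the PR. convention. The VOWELS() procedure, the internal
variable CH, and the label MAIN are all invented for this
illustration; the transfer around the procedure body
anticipates the discussion of procedure bodies below.

```snobol
* LINE IS A FORMAL VARIABLE; CH IS AN ADDITIONAL INTERNAL VARIABLE
          DEFINE('VOWELS(LINE)CH','PR.VOWELS')     :(MAIN)
* REMOVE EACH VOWEL FROM LINE AND APPEND IT TO THE RESULT
PR.VOWELS LINE ANY('AEIOU') . CH = NULL            :F(RETURN)
          VOWELS = VOWELS CH                       :(PR.VOWELS)
* THE MAIN PROGRAM HAS A VARIABLE OF ITS OWN NAMED CH
MAIN      CH = 'UNCHANGED'
          OUTPUT = VOWELS('PROCEDURE')
          OUTPUT = CH
END
```

The first line printed should be OEUE; the second should be
UNCHANGED, because the name CH referred to a different,
internal variable while the procedure call was being
executed.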

Procedure Bodies. A DEFINE() procedure declares to the
Snobol system the name of a programmer-defined procedure,
the names of its formal variables, additional internal
variables, and its entry label, but gives no indication of
its effect; that information is supplied by a procedure
body, which consists of a series of Snobol statements to be
executed whenever the procedure is invoked. A procedure body
may consist of any number of Snobol statements, one of which
(not necessarily the first) must have the label declared by
the DEFINE() as the entry label for this procedure. The
statements of a procedure body may be of any kind: they may
include procedure declarations and references to other
procedures, or even to the procedure being defined. A
procedure whose body contains a reference to itself is
termed a "recursive procedure"; examples of recursive
procedures may be found in Chapter 8.

The statements of a procedure body should be executed 
only in response to a procedure call, so procedure bodies 
should be located within a Snobol program text in such a way
as to be outside the flow of control of the "main program";
the main program consists of all statements except those of 
procedure bodies. 

The specification of a procedure's action is made 
general rather than specific by using the names of the 
formal variables within the procedure body. In the
definition of the COUNT() procedure shown below, the formal
variables PAT and LINE are used to represent the many
different arguments with which this procedure may be called
on different occasions.

          DEFINE('COUNT(PAT,LINE)','PR.COUNT')    : (END.COUNT)
PR.COUNT  LINE PAT = NULL                         : F(RETURN)
          COUNT = COUNT + 1                       : (PR.COUNT)
END.COUNT

The first statement of the procedure body specifies 
that the value of the second argument LINE is to be searched 
for an instance of the first argument PAT; the second 
statement of the procedure body increments the value of 
COUNT each time a pattern is found and sends control back to 
the first statement to institute another search. COUNT () is 
thus generally defined as a procedure which counts the
number of occurrences of some pattern within some string;
information as to what pattern and what string are to be
used will be supplied to the procedure body by the arguments
each time the procedure is called. (Notice how the procedure
body has been removed from the flow of control of the main
program by the unconditional transfer following its DEFINE()
statement.)

The internal variable named COUNT, rather than any 
other variable, is assigned the result because of a
convention which exists for the returning of values; when a
success return from a procedure is taken, the last value
assigned within the procedure body to the variable whose
name is the same as that of the procedure is returned as the
value of the procedure call. If that variable, which is
termed the "result variable," is assigned no value during
the execution of the procedure body, the null value is
returned. A value of any datatype may be returned as the
value of a procedure call.

The Returns RETURN, NRETURN, and FRETURN. The logical
end of a procedure body is signalled by a go-to specifying a
transfer to RETURN (the standard success return), to NRETURN
(another success return, for returning a variable rather
than a value), or to FRETURN (the failure return). These
transfers have special system definitions and constitute
requests to the Snobol system to return control to the
statement from which the procedure was called. Any number of
statements in a procedure body may contain transfers to
RETURN, NRETURN, or FRETURN; the first such transfer to be
executed ends execution of the procedure call. If either
success return (RETURN or NRETURN) is executed, the value of
the result variable is returned as the value of the
procedure call and execution of the calling statement
resumes at the point of the call; if the failure return
FRETURN is executed, no value is returned but control is
sent directly to the go-to of the calling statement where 
the failure transfer will be taken. 

There is no restriction against using RETURN, NRETURN,
or FRETURN as the label of any statement within the program
text, but if this is done the special system definition of
that return is lost. Hence RETURN, NRETURN, and FRETURN must
not be used as labels within any program which employs them
to return from a programmer-defined procedure, or else a 
transfer to RETURN, for example, from a procedure body will 
send control not to the calling statement but to the 
statement labelled RETURN. 






The example below presents another way to write the
COUNT() procedure, in which the procedure body includes both
RETURN and FRETURN transfers. (An example of a procedure
which uses NRETURN may be found toward the end of this
chapter.) As before, the procedure is designed to count the
number of occurrences of some pattern within some string;
here, however, if no instances of the pattern are found, the
procedure does an FRETURN, causing failure of the rule from
which it was called, rather than returning the null value.

          DEFINE('COUNT(PAT,LINE)','PR.COUNT')    : (END.COUNT)
PR.COUNT  LINE PAT = NULL                         : F(OUT.COUNT)
          COUNT = COUNT + 1                       : (PR.COUNT)
OUT.COUNT IDENT(COUNT,NULL)                       : S(FRETURN) F(RETURN)
END.COUNT

As in the earlier definition of COUNT (), the counting 
loop is executed until the pattern match fails. When this
happens, however, control is sent to the statement labelled
OUT.COUNT which tests COUNT to see whether or not it has
been incremented. If it has not (that is, if the pattern match
failed on the first attempt), then COUNT has a null value,
the test will succeed, and the procedure will do an FRETURN
causing failure of the procedure call; if COUNT is non-null,
then the procedure will do a RETURN, returning the value of 
COUNT as the value of the procedure call. Often, as here, a 
success transfer may lead to an FRETURN, and a failure 
transfer to a RETURN. 
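
A sketch of how a calling program might respond to the two
kinds of return follows. The sample strings, the label names
MAIN and NONE, and the messages are assumptions made for the
illustration; the procedure body is the one shown above.

```snobol
          DEFINE('COUNT(PAT,LINE)','PR.COUNT')     :(MAIN)
PR.COUNT  LINE PAT = NULL                          :F(OUT.COUNT)
          COUNT = COUNT + 1                        :(PR.COUNT)
OUT.COUNT IDENT(COUNT,NULL)                        :S(FRETURN)F(RETURN)
* THE SAMPLE STRINGS BELOW ARE ASSUMED FOR THE ILLUSTRATION
MAIN      OUTPUT = COUNT('S','MISSISSIPPI')        :F(NONE)
* NO Z OCCURS, SO THIS CALL DOES AN FRETURN AND THE RULE FAILS
          OUTPUT = COUNT('Z','MISSISSIPPI')        :F(NONE)
          OUTPUT = 'THIS RULE IS NOT REACHED'      :(END)
NONE      OUTPUT = 'NO OCCURRENCES'
END
```

The first call should return 4, which is printed; the second
call fails, so its assignment to OUTPUT is never made and
control passes to NONE.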



Procedure Calls. When an assignment statement such as

     NUMBERA = COUNT('A',RECORD)           : F(NONE)

is executed, the procedure call must be processed before the 
assignment can take place; hence, execution of the calling 
statement is temporarily suspended while the Snobol system 
executes the procedure call. 

To carry out the call, the Snobol system begins by
taking several automatic actions. First the names in the
first argument of the DEFINE() statement are made to refer
to new variables which are internal to this call of the
procedure. The procedure name now refers to the internal
result variable, and the formal variable names refer to
internal formal variables. Next the internal variables to
which these names now refer are assigned the values needed
for the execution of this call: the result variable (COUNT
in this case) is assigned the null value, the formal
variables are assigned the values of their corresponding
arguments (in this example, the formal variable PAT is
assigned the character A and the formal variable LINE is
assigned the value of the variable RECORD). Since there is
no way to make reference to a variable except by using its
name, this means that the variables formerly referred to by
the names COUNT, PAT, and LINE are inaccessible during the
execution of this procedure call.

After this preparation is completed, control is sent to 
the entry label and execution of the procedure body begins. 
The action of the procedure is carried out using the values 
of the arguments provided to the procedure call, since these 
have just been assigned as the values of the formal
variables. The statements of the procedure body are executed 
in the usual way, until a request for the system to do a 
return is encountered. 

Any return automatically reverses the actions of the 
preparation process; the names of the procedure and of the 
formal variables are made to refer to the same variables
which they named just before the procedure call was
executed, and thus the internal variables, having served
their purpose, become in turn inaccessible. The flow of
control reverts to the calling statement: on a RETURN, to
the point of the procedure call; on an FRETURN, to the
go-to.

The Passing of Arguments. When a procedure is invoked,
the values of the arguments in the procedure reference are
said to be "passed" to become the values of the formal
variables. The values of the arguments are assigned to the
corresponding formal variables on a one-to-one, left-to-right
basis. Any procedure, predefined or programmer-defined,
may be called with more or fewer arguments than its
definition provides for. Missing arguments are taken to have
the null value; extra arguments are evaluated before the
procedure call is executed, but are otherwise ignored.
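
The pairing rule can be stated compactly in code. The
following Python sketch is a model under stated assumptions
(the name bind_arguments is hypothetical, and the null value
is again represented by the empty string); it shows only the
left-to-right pairing, the padding of missing arguments with
null, and the discarding of extras.

```python
NULL = ""   # assumption: the null value is modelled as the empty string

def bind_arguments(formals, args):
    """Pair formals with argument values left to right.

    Missing arguments become null; extra argument values are dropped
    (they have already been evaluated by the time this is called).
    """
    kept = list(args[:len(formals)])
    padded = kept + [NULL] * (len(formals) - len(kept))
    return dict(zip(formals, padded))
```

For example, bind_arguments(["PAT", "LINE"], ["A"]) leaves
LINE null, and a third argument value would simply be
ignored.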

In Snobol, all arguments are passed "by value"; that
is, the arguments are evaluated and the resulting values are
passed to the procedure body. (In fact, the mechanism for
passing arguments has the same effect as if a Snobol
assignment rule were executed, with the formal variable on
the left side and the argument on the right.) This method of
passing arguments assures that the values of variables in
the arguments are not affected by execution of the procedure
call. For instance, in the call

          NUMBERA = COUNT('A',RECORD)                :F(NONE)

it is the value of the variable RECORD which is passed as
the value of the second argument. The procedure will use,
not the variable RECORD, but only the internal formal
variable LINE which has been assigned the value of RECORD at
the time of the call. Thus the value of RECORD is always the
same before and after a call of the COUNT() procedure is
executed.

The arguments used in a procedure reference may be any
expressions having values which the procedure body will
handle properly. A call to COUNT() such as in the statement

          NUMBERV = COUNT(ANY('AEIOU'),RECORD)       :F(NONE)

would pass the pattern returned as the value of the
procedure call ANY('AEIOU') to be the value of the variable
PAT. Since PAT is used in the pattern part of a statement, a
pattern value is appropriate and the number of vowels within
the value of RECORD will be returned as the value of this
call to the COUNT() procedure.

While the first formal variable, PAT, may acquire
either a string or a pattern value, the second formal
variable, LINE, may acquire only a string as value, since it
is used within the procedure body as a string reference.
Execution of a procedure call of the form

          NUMBERV = COUNT(RECORD,ANY('AEIOU'))       :F(NONE)

(in which the programmer has presumably forgotten the
correct order of the arguments) will pass the formal
variable LINE a pattern value; when the procedure body is
entered an execution-time error will result, since the first
field in a replacement rule cannot be a pattern.

Additional Internal Variables. The names of variables
which are to be internal to a procedure call (in addition to
the result variable and any formal variables) are also made
to refer to distinct internal variables at each procedure
call, thus making the variables previously referred to by
those names temporarily inaccessible; the names are restored
to their former significance when a return from the
procedure call is taken. The internal variables which they
name are initially null at every call of the procedure, just
like the result variable. There are thus two possible
reasons for declaring additional internal variables: to
prevent their names from conflicting with names used
elsewhere for other purposes, and to take advantage of the
automatic null initialization at each call. Any number of
additional internal variables may be declared by writing
their names in the first argument of a DEFINE() procedure.






As an example of the usefulness of additional internal
variables, consider the LONGER() procedure, which employs
four of them. This procedure compares the two strings given
as the values of its first two arguments to determine which
contains the longer sequence of the characters specified by
the value of its third argument; it returns as its value the
string containing the longer sequence. If the size of the
longest sequence is the same in both strings, then by
convention the first string is returned as the value of the
procedure call; if neither string contains a character given
by the third argument, a transfer to FRETURN is taken,
causing failure of the procedure call. Thus execution of the
assignment statement

          OUTPUT = LONGER('HILARIOUS','TREACHEROUS','AEIOU')
+                                                    :F(NOVOWEL)

would cause the string HILARIOUS to be printed, since its
longest vowel sequence is longer than any vowel sequence in
the string TREACHEROUS.

            DEFINE('LONGER(S1,S2,SEQ)T1,T2,SAVE,LONGEST',
+                  'PR.LONGER')                      :(END.LONGER)
*  MAKE COPIES OF THE TWO STRINGS TO BE COMPARED
PR.LONGER   T1 = S1
            T2 = S2
*  FIND THE LONGEST SEQUENCE IN THE FIRST STRING
*  ASSIGN ITS SIZE TO THE INTERNAL VARIABLE NAMED LONGEST
T1.LONGER   T1 SPAN(SEQ) . SAVE = NULL               :F(T2.LONGER)
            LONGEST = GT(SIZE(SAVE),LONGEST) SIZE(SAVE)
+                                                    :(T1.LONGER)
*  SEE IF THERE IS A SEQUENCE IN THE SECOND STRING
*  WHICH IS LONGER THAN THE LONGEST SEQ IN THE 1ST STRING
*  IF SO, ASSIGN THE SECOND STRING AS THE VALUE OF THE
*  RESULT VARIABLE AND RETURN
T2.LONGER   T2 SPAN(SEQ) . SAVE = NULL               :F(OUT.LONGER)
            LONGER = GT(SIZE(SAVE),LONGEST) S2
+                                           :S(RETURN)F(T2.LONGER)
*  IF NO SEQUENCE WAS FOUND IN EITHER STRING, FAIL
*  OTHERWISE RETURN THE FIRST STRING AS VALUE OF THE CALL
OUT.LONGER  LONGER = DIFFER(SAVE,NULL) S1
+                                           :S(RETURN)F(FRETURN)
END.LONGER

This procedure uses four additional internal variables
named T1, T2, SAVE, and LONGEST. T1 and T2 are needed
because the method used for determining the longest vowel
sequence in S1 and S2 deletes each vowel sequence which is
found. Since the original strings must be preserved to be
returned as the value of the procedure call, the replacement
statements T1.LONGER and T2.LONGER use the variables T1 and
T2 rather than S1 and S2, allowing the values of S1 and S2
to remain unchanged. The internal variable SAVE is assigned
each vowel sequence which is found. The fact that SAVE is
given the null value initially allows the test in the
statement labelled OUT.LONGER to determine whether or not
any vowel sequences have been found; if SAVE still has its
null value, then neither string contains a vowel and an
FRETURN is taken. The internal variable LONGEST is used to
keep track of the size of the currently longest vowel
sequence as each is successively found within the first
string. When the determination of the size of the longest
sequence has been completed, this number is then compared
with the size of each vowel sequence as it is found in the
second string until either a longer sequence is found (in
which case the second string is returned as the value of the
procedure call) or until all vowel sequences have been
considered (in which case either the first string is
returned or failure is signalled).
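
The decision rule which LONGER() implements can be checked
against a short model in another language. The Python
function below is a sketch, not a translation of the Snobol
text: the name longer is hypothetical, failure (FRETURN) is
modelled by returning None, and the third argument is
assumed to be a non-empty string of characters.

```python
import re

def longer(s1, s2, seq):
    """Return s1 or s2 by the LONGER() rules; None models failure."""
    # A "sequence" is a maximal run of characters drawn from seq,
    # which is what SPAN(SEQ) matches.
    runs = re.compile("[" + re.escape(seq) + "]+")
    longest1 = max((len(r) for r in runs.findall(s1)), default=0)
    longest2 = max((len(r) for r in runs.findall(s2)), default=0)
    if longest2 > longest1:
        return s2            # second string has a strictly longer run
    if longest1 == 0:
        return None          # no run in either string: the call fails
    return s1                # ties go to the first string by convention
```

With the arguments of the example above, HILARIOUS (whose
run IOU has three characters) is returned in preference to
TREACHEROUS (whose longest runs have two).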

Since in this procedure body the internal variables T1
and T2 are assigned the values of the arguments as soon as
the procedure body is entered, the only reason for declaring
them to be internal is to prevent conflicts with other uses
of the names T1 and T2. The internal variables SAVE and
LONGEST are similarly protected, but also take advantage of
the fact that they are initialized to null each time the
LONGER() procedure is called.

Note that the use of the additional internal variable 
LONGEST is not really necessary since the result variable 
LONGER may be substituted for it wherever it occurs. Result 
variables have exactly the properties of additional internal 
variables until a success transfer is taken, so they are 
often assigned temporary values which are needed during the
processing of a procedure call. When the final value of a
call has been determined, it can then be assigned to the
result variable and a return made to the statement in which
the procedure call occurred. 

References to External Variables. The principle of a
programmer-defined procedure is that of a "sub-program,"
independent of the program with which it is used; it
receives values through its arguments, performs some process
using those values, and returns the result. If temporary
values are needed, the procedure assigns them to additional
internal variables, so that it avoids changing the values of
any variables not internal to itself, i.e., those whose
names do not appear within the first argument of the
DEFINE() statement for the procedure.






Procedures written in such a way as to make reference
to no values other than those of their internal variables
(or to literals within their own bodies), and which assign
values only to their own internal variables, are desirable
for many reasons. They are easy to move from program to
program since they will operate correctly regardless of
their environment, and they are easy to use because they can
influence that environment only through the result which
they return (including, of course, the possible "result" of
failing).

At the same time, there are sometimes good reasons for
relaxing this discipline, in pursuit of the same goals for
which procedures are written in the first place: to make
programs easier to write and clearer to read. One example of
such a motivation has already come up in some of the
examples; in the procedure body for the LONGER() procedure,
for example, the statement

T1.LONGER   T1 SPAN(SEQ) . SAVE = NULL               :F(T2.LONGER)

occurs. Here NULL is the name of a variable which is
external to the call of the LONGER() procedure; since the
name NULL is not included in its declaration, it receives no
special treatment when this procedure is called; it
continues to refer to the same variable before, during, and
after a call to LONGER(). Thus, if LONGER() were to be
called from a program which had assigned some non-null value
to the variable named NULL, it would not work as intended.

In this case there are several ways to restore the
independence of the LONGER() procedure; the identifier NULL
can be replaced in its body by a literal null string (two
adjacent quotation marks), or by nothing, or the name NULL
can be declared as naming an additional internal variable
for LONGER(), thus assuring that NULL will refer to a
variable initialized to the null value each time LONGER() is
called. For this procedure such precautions seem extreme,
but they might make sense if LONGER() were a much more
complicated procedure, and were intended for use by people
other than its programmer.

As another motivation for making reference to external
variables, consider a programmer-defined test procedure
which determines whether or not the string given as its
argument is a palindrome, that is, whether it reads the same
from left to right as from right to left. The complete
program presented below uses the PALIN() procedure to
perform this test. The program reads all trimmed records of
a group of data but prints only those which are palindromes.






*  PALINDROME-FINDING PROGRAM
*
*  SET UP PATTERN NEEDED BY THE PALIN() PROCEDURE
*  ASSIGN IT TO A MAIN-PROGRAM VARIABLE
            PAL.PAT = POS(0) LEN(1) $ CH RTAB(1) . CAND *CH
            DEFINE('PALIN(CAND)CH','PR.PALIN')       :(END.PALIN)
*
*  IF CANDIDATE NOW CONSISTS OF 1 OR 0 CHARACTERS, SUCCEED
*  OTHERWISE APPLY THE PATTERN AGAIN
PR.PALIN    LE(SIZE(CAND),1)                         :S(RETURN)
            CAND PAL.PAT                    :S(PR.PALIN)F(FRETURN)
END.PALIN
*
READ        RECORD = TRIM(INPUT)                     :F(END)
PRINT       OUTPUT = PALIN(RECORD) RECORD            :(READ)
END

Output from this program could be strings of the form 

HANNAH 

I 

ROTOR

a 4 « 

NOON

SAGAS 

* 

103595301 

YREKABAKERY

><> <><> <>< 

The PALIN() procedure uses virtually the same pattern
as that shown at the end of Chapter 4 for finding words with
identical first and last characters; the pattern is changed
only by the re-assignment of the substring matched by
RTAB(1) to the variable named CAND. Thus, on each iteration
of the loop the string being searched is shortened by the
loss of its first and last characters; a new set of first
and last characters is then tested for identity. The loop is
executed until either (1) the end characters being tested
are found to be different, upon which an FRETURN is taken
signifying that the string is not a palindrome, or (2) the
size of the string is reduced to zero or one, in which case
a RETURN is taken since this indicates that all characters
have been tested and that the string is a palindrome. Note
that the rule in the statement labelled PR.PALIN will
succeed immediately if the size of the argument is either
zero or one, meaning that strings of one or no characters
are palindromes by definition. The PALIN() procedure returns
the null value on success, since the result variable PALIN
is not assigned a value within the procedure body.
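
The loop performed by PALIN() may be easier to follow when
set out in a language with explicit loops. The Python
sketch below is a model only; the name palin is
hypothetical, and success and failure are represented by
the values True and False rather than by RETURN and
FRETURN.

```python
def palin(cand):
    """Strip matching end characters until at most one character remains."""
    while len(cand) > 1:          # the test made by LE(SIZE(CAND),1)
        if cand[0] != cand[-1]:   # first and last characters differ
            return False          # models FRETURN: not a palindrome
        cand = cand[1:-1]         # keep only the part matched by RTAB(1)
    return True                   # models RETURN: zero or one character left
```

As in the Snobol version, strings of one or no characters
succeed at once without any comparison being made.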






Here the pattern on which PALIN() relies is constructed
once, in the statement just above the DEFINE(), and assigned
to the variable PAL.PAT. The reason for doing this is clear:
since internal variables are internal to a single call of a
procedure and their values never persist between calls, if
PAL.PAT were declared to be the name of an additional
internal variable of PALIN() then the pattern assignment
would have to be moved into the procedure body, and thus the
pattern would have to be constructed anew at each call of
the PALIN() procedure, a substantial amount of unnecessary
effort.

It is true that PALIN() will not work properly if the
program calling it inadvertently assigns a different value
to the variable PAL.PAT. It might seem that this kind of
error could be avoided by rewriting PALIN() to accept the
pattern as another argument, rather than merely using the
value of an external variable; but that turns out not to be
true. A call to such a re-written PALIN() procedure would be
something like

          PALIN(POS(0) LEN(1) $ CH RTAB(1) . CAND *CH,RECORD)

Apart from the bother of writing the invariant pattern in
every reference to PALIN(), the pattern is once again being
constructed at each call of PALIN(), in the evaluation of
the argument rather than within the procedure body. The
calling program can avoid the repeated evaluation of the
pattern by executing the assignment statement

          PAL.PAT = POS(0) LEN(1) $ CH RTAB(1) . CAND *CH

and then making references to the procedure in the form

          PALIN(PAL.PAT,RECORD)                      :F(NOPALIN)

But now, just as before, the calling program is responsible
for assuring that PAL.PAT has the correct value at the time
of the call. So the original PALIN() procedure cannot be
improved upon in this way, and has the additional merit of
requiring only one argument instead of two. The conclusion
to be drawn is that a pattern used by a procedure must
either be constructed at each procedure call, or else must
be assigned as the value of an external variable so that it
will be available for use by repeated procedure calls.



Notice, however, how the pattern which is the value of
the main-program variable PAL.PAT can cause assignment to
the internal formal variable named CAND and to the
additional internal variable named CH within the PALIN()
procedure. The pattern PAL.PAT calls for immediate
assignment to whatever variable is currently referred to by
the name CH, and conditional assignment to whatever variable
is currently referred to by the name CAND; it specifies
nothing about which variables those must be. If PAL.PAT is
used in a statement of the main program, then it will cause
assignments to the main-program variables named CH and CAND.
At a call of the PALIN() procedure, though, those two names
are made to refer to different variables, internal to the
procedure call; so if PAL.PAT is used (as above) in a
statement within the body of PALIN(), it will cause
assignments to the two variables internal to the call.




A procedure call which alters the value of a variable
not internal to the call is said to have a "side-effect."
This terminology exists because of the presumption that the
main effect of a procedure is to return a value or to direct
the flow of control; in fact, however, procedures are often
written solely for the purpose of producing side-effects.



One reason for defining a procedure which produces a
side-effect is to keep some sort of record of occurrences
inside and outside of procedure calls. For instance, the
COUNT() procedure presented earlier could be changed so that
in addition to its former action of returning as its value
the number of instances of some pattern within some string,
it also increments an external counter by that number. This
new version of COUNT(), TCOUNT(), could be written as
follows.



            DEFINE('TCOUNT(PAT,LINE)','PR.TCOUNT')   :(END.TCOUNT)
PR.TCOUNT   LINE PAT = NULL                          :F(OUT.TCOUNT)
            TCOUNT = TCOUNT + 1                      :(PR.TCOUNT)
OUT.TCOUNT  TALLY = TALLY + TCOUNT                   :(RETURN)
END.TCOUNT



Aside from the systematic replacement of COUNT by
TCOUNT, this procedure definition is the same as that of the
first version of COUNT(), except that before returning the
procedure increments the value of the external variable
TALLY by the value of the result variable. Since TALLY is
not an internal variable, its value can be increased
throughout a program over repeated calls to TCOUNT(), and
thus represent a total of the results of many invocations of
that procedure; for that matter, TALLY might also be
incremented by other assignments in the main program or by
calls to other procedures as well.




The inclusion of the side-effect involving TALLY
specializes the COUNT() procedure, and the same record could
be kept without recourse to side-effects by keeping the
tally entirely in the main program, as in the segment

          RESULT = COUNT('A',RECORD)
          TALLY = TALLY + RESULT

and so forth. But that requires that the tally-incrementing
statement be written once for every reference to the
procedure; if there are many references to COUNT() in a
program, then the whole text can be shortened considerably
by writing the statement which increments TALLY once in the
TCOUNT() procedure body and permitting the side-effect to
occur.
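
The trade-off can be seen in miniature in the following
Python sketch, offered as a model under stated assumptions
rather than as a rendering of the Snobol procedure: the
names tcount and counters are hypothetical, the pattern is
restricted to a non-empty literal string, and the external
variable TALLY is modelled as an entry in a dictionary which
lives outside the function.

```python
counters = {"TALLY": 0}   # models the main-program variable TALLY

def tcount(pat, line):
    """Count occurrences of the literal pat in line; also add them to TALLY."""
    found = line.count(pat)        # the value returned to the caller
    counters["TALLY"] += found     # the side-effect on the external variable
    return found
```

Each caller receives its own count as the value of the call,
while counters["TALLY"] accumulates the total over all calls
without any caller having to write the incrementing
statement.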

Another reason for changing the value of an external
variable in a procedure body is to take advantage of an
output association which that variable may have. A SKIP()
procedure can be defined, for example, to "skip" the number
of lines specified by its argument by assigning the null
value repeatedly to the main-program variable named OUTPUT.



            DEFINE('SKIP(NUM)','PR.SKIP')            :(END.SKIP)
PR.SKIP     NUM = GT(NUM,0) NUM - 1                  :F(RETURN)
            OUTPUT = NULL                            :(PR.SKIP)
END.SKIP



If SKIP() is called in the sequence

          OUTPUT = HEAD1
          SKIP(3)
          OUTPUT = HEAD2

then the first heading, the three empty lines, and the
second heading are all written to the same file, the one
with which the variable OUTPUT is associated, since the
variable referred to by the name OUTPUT is the same both
inside and outside the procedure call. Note that SKIP()
would not work as intended if OUTPUT were declared to refer
to a variable internal to the procedure call, since the
association is with the main-program variable, not with the
name OUTPUT.



Quite a different motivation for side-effects arises
when a procedure does not have a fixed name of an external
variable in its procedure body, but rather can change the
values of different variables when it is called with
different arguments.






One way to do this is to define a procedure which has a
string as its argument and which uses indirect referencing
within its procedure body to refer to an external variable
named by that string, or by a string derived from it.
Consider the following STORE() procedure, whose purpose is
to store the string which is its first argument as the value
of one of a set of successively-named variables; the name of
the variable which is to be used is formed by concatenating
the length of the string to be stored, then the value of the
second argument of STORE(), then the index number of the
next available successively-named variable of the set. If
the procedure reference

          STORE('CAT','LIST')

is written, for instance, and CAT is the first three-letter
word to be stored, then it will become the value of the
variable named 3LIST1. If STORE() were called repeatedly
with the string LIST as its second argument, then it would
store one-character strings as the values of the variables
1LIST1, 1LIST2, ..., $(1 'LIST' N), two-character strings as
the values of 2LIST1, 2LIST2, ..., $(2 'LIST' N), etc. The
STORE() procedure further keeps track of the last used index
number for each 'list' by storing these numbers as the
values of the variables 1LIST, 2LIST, ..., $(N 'LIST'). Note
that all names formed by the STORE() procedure depend on the
value of its second argument, but all begin with a number
and so are necessarily distinct from any names which may be
written in the program text.

The definition of the STORE() procedure could be

            DEFINE('STORE(WORD,NAME)','PR.STORE')    :(END.STORE)
*  ADD ONE TO THE INDEX NUMBER FOR THIS SIZE WORD LIST
PR.STORE    $(SIZE(WORD) NAME) = $(SIZE(WORD) NAME) + 1
*
*  STORE THE WORD AS THE VALUE OF THE "NEXT" VARIABLE
            $(SIZE(WORD) NAME $(SIZE(WORD) NAME)) = WORD
+                                                    :(RETURN)
END.STORE

STORE() is thus a procedure which always succeeds,
returning the null value. Its purpose is always to have the
side-effect of changing the value of one of the great many
external variables whose names are dependent on the various
values of its second argument.
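
The naming scheme is perhaps clearest when the variables
are pictured as entries in a table indexed by name, which is
roughly what indirect referencing provides. The Python
sketch below is such a picture and nothing more: the names
store and variables are hypothetical, and a dictionary is
assumed as a stand-in for the program's variables.

```python
variables = {}   # assumption: one table entry per variable, keyed by its name

def store(word, name):
    """File word under <size><name><index>; remember the last index used."""
    counter = str(len(word)) + name                      # e.g. '3LIST'
    variables[counter] = variables.get(counter, 0) + 1   # next index for this size
    variables[counter + str(variables[counter])] = word  # e.g. '3LIST1'
```

Calling store('CAT', 'LIST') and then store('DOG', 'LIST')
fills the entries 3LIST1 and 3LIST2 and leaves the entry
3LIST holding 2, the last index used for three-letter words.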






Levels of Internal Variables. When a procedure call is
to use variables other than those internal to itself, either
to refer to their values or to assign new values to them,
then the particular relation between names and variables at
any time becomes important. In the preceding sections the
examples have assumed that a procedure was called from a
main program, and thus all names either referred to
variables internal to the procedure call, or else to
variables associated with the main program. But the
situation may be more complicated than this, because one
procedure may be called and then it may call another
procedure: if the second procedure makes reference to
variables other than its own internal variables, the
possibility exists that it may use a name which refers to one
of the internal variables of the procedure which called it,
rather than to a main-program variable external to both of
them. Sometimes this is what was intended and sometimes not;
care must be taken to insure that the names used by
procedures will always refer to the intended variables.



The number of sets of internal variables which have
become temporarily accessible at any point in time during
execution is termed the "level" of execution. When a program
begins executing, it is at level zero and the statements
executed at level zero are the technical definition of the
main program. If a statement of the main program calls a
procedure, the statements of that procedure's body will be
executed at level one; if that procedure calls a second
procedure before returning, then the statements of the
second procedure's body will be executed at level two. When
the second procedure does a return, the first procedure will
resume execution at level one; when it returns, the main
program will resume execution at level zero. It may then
call another procedure which will execute at level one, and
so forth. Any number of levels may be attained; there is no
level lower than zero, however, so any attempt to do a
return from a statement of the main program (caused by
allowing control to flow into a procedure body by accident
rather than through a procedure call) will cause an
execution-time error. Such an error can be caused by
neglecting to write an unconditional transfer following a
DEFINE() procedure in any of the above examples.

At different times a procedure may be executed at
different levels, depending on the length of the chain of
calls by which it was reached. The only change in executing
at different levels is in the variables to which names
refer. A procedure executing at level three, for example,
will be executing in an environment in which most names
refer to main-program variables, but some names refer to
variables internal to whatever procedure call is at level
one, some names refer to variables internal to whatever
procedure call is at level two, and some names refer to its
own internal variables at level three. If this same
procedure is later called directly from a statement of the
main program, then all names except those of its own
internal variables will refer to main-program variables.
This difference in environment must be considered to assure
that a procedure will refer to and assign values to the
intended external variables, no matter from what level it is
called and no matter which procedures (and thus what names
of internal variables) are at levels below it in any
particular chain of calls.
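
One way to picture the levels is as a stack of tables, one
per active call, with each name resolved by searching from
the current level down to level zero. The Python sketch
below illustrates that picture and is not a claim about how
any Snobol system stores its variables; the function lookup,
the procedure names FIRST and SECOND, and the variable names
are all hypothetical.

```python
def lookup(name, frames):
    """Return (level, value) for name, searching from the top level down."""
    for level in range(len(frames) - 1, -1, -1):
        if name in frames[level]:
            return level, frames[level][name]
    return 0, ""    # an unused name refers to a null main-program variable

frames = [
    {"TOTAL": "10", "WORD": "MAIN"},             # level 0: main program
    {"FIRST": "", "WORD": "ONE", "TEMP": ""},    # level 1: a call of FIRST()
    {"SECOND": "", "TEMP": "TWO"},               # level 2: a call of SECOND()
]
```

While the call of SECOND() is active, TEMP is found at level
two, WORD at level one, and TOTAL at level zero; once that
call returns and its table is removed, TEMP is again the
level-one variable.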

As an illustration of the same name referring in
different environments to variables at three different
levels, consider an improved version of the PALIN()
procedure, PALIND(), which would delete all spaces and
punctuation characters from its argument before testing it
for being a palindrome, thus allowing strings of the form
DOC, NOTE. I DISSENT. A FAST NEVER PREVENTS A FATNESS. I
DIET ON COD to be accepted. In the complete program below
the name CAND is used to refer to the trimmed record read
from the input file, to the formal variable of the PALIND()
procedure, and to a formal variable of the DELETE()
procedure which is called by the PALIND() procedure to
perform the deletion. Nevertheless, there is no possibility
of the name CAND referring to a variable at the wrong level:
within the PALIND() procedure (in this example) it always
refers to an internal variable at level one, while within
the DELETE() procedure it always refers to an internal
variable at level two. The level zero variable named CAND
can thus be referred to only by statements of the main
program.

            DEFINE('PALIND(CAND)CH','PR.PALIND')
*
*  SET UP PATTERN NEEDED BY THE PALIND() PROCEDURE
*  ASSIGN IT TO A MAIN-PROGRAM VARIABLE
            PAL.PAT = POS(0) LEN(1) $ CH RTAB(1) . CAND *CH
+                                                    :(END.PALIND)
*
*  CALL DELETE TO REMOVE SPACES AND PUNCTUATION FROM ARG
PR.PALIND   CAND = DELETE(ANY(' .,:;'),CAND)
*
*  PROCEED AS IN THE PALIN() PROCEDURE
LOOP.PALIND LE(SIZE(CAND),1)                         :S(RETURN)
            CAND PAL.PAT                 :F(FRETURN)S(LOOP.PALIND)
END.PALIND
            DEFINE('DELETE(PAT,CAND)','PR.DELETE')
+                                                    :(END.DELETE)
*  REMOVE ALL PATTERNS FROM THE CANDIDATE
PR.DELETE   CAND PAT = NULL                          :S(PR.DELETE)
            DELETE = CAND                            :(RETURN)
END.DELETE
*
*  MAIN PART OF PROGRAM
*
*  READ ALL RECORDS BUT PRINT ONLY THE PALINDROMES
READ        CAND = TRIM(INPUT)                       :F(END)
PRINT       OUTPUT = PALIND(CAND) CAND               :(READ)
END

In this program the two DEFINE() statements, the
assignment to PAL.PAT, the READ statement, the PRINT
statement, and the END statement constitute the complete
main program. These statements are executed in the order
specified by the go-to's until an attempt is made to perform
the assignment in the PRINT statement; before this
assignment can occur, the value of the call to the PALIND()
procedure must be obtained. This call causes the variable
named CAND, internal to level one, to be assigned the same
value as the main-program variable CAND, that is, the
candidate to be tested, and a transfer to be taken to
PR.PALIND. Before the assignment specified in this statement
can be performed, however, a call to the DELETE() procedure
must be processed. This causes the variable named CAND
internal to the level two call of DELETE() to be assigned
the same value as that of the level one variable CAND, the
string to be tested. This string is searched repeatedly for
spaces and punctuation characters and when all have been
deleted the resulting, possibly shortened, string is
returned to the statement PR.PALIND where it is assigned as
the new value of the level one variable CAND. The value of
this variable is then searched, perhaps repeatedly, for the
PAL.PAT pattern; each time the search is successful, the
value of the level one variable CAND is shortened by the
loss of its first and last characters. If the candidate is
indeed a palindrome, then the final value of the level one
variable CAND will be a string of one or zero characters;
the PALIND() procedure will take the success return and
transfer back to the statement labelled PRINT. Here the
value of the level zero variable named CAND, the original
string as it was read from the input file, is printed
whenever PALIND() succeeds.
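
The same sequence of events can be followed in a short
Python model, given here as a sketch under stated
assumptions: the names delete and palind are hypothetical,
success and failure are modelled by True and False, and the
set of characters removed is simply the space together with
the period, comma, colon, and semicolon.

```python
def delete(chars, cand):
    """Return cand with every character found in chars removed."""
    return "".join(c for c in cand if c not in chars)

def palind(cand):
    """Test cand for being a palindrome, ignoring spaces and punctuation."""
    # Reassigning the parameter changes only this call's own variable;
    # the caller's string is left exactly as it was.
    cand = delete(" .,:;", cand)
    while len(cand) > 1:
        if cand[0] != cand[-1]:
            return False
        cand = cand[1:-1]
    return True

record = "DENNIS AND EDNA SINNED."
is_palindrome = palind(record)
```

After the call, record still holds the full sentence with
its spaces and period, just as the level zero variable CAND
is untouched by the shortening done at levels one and two.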




Output from this program could be strings such as 

CIVIC

SUMS ARE NOT SET AS A TEST ON ERASMUS.

BOTCH 

DEIFIED

DENNIS AND EDNA SINNED. 

There are two different ways of classifying variables,
which are useful in different descriptions of procedures. On
the one hand, there are main-program variables, at level
zero, as opposed to the internal variables at higher levels;
it is the level zero, or main-program, variables which have
the lasting values associated with all names, while internal
variables at all higher levels become accessible only
temporarily during procedure calls and are initialized anew
at each call. On the other hand, from the viewpoint of
discussing any particular procedure call, the distinction is
between names of internal variables which are always its
own, as opposed to external variables which may be different
variables when the procedure executes at different levels.

The important special case in which these two 
descriptions are equivalent is for procedures executing at 
level one; at level one, the external variables are all 
main-program variables. The fact that external variables 
cannot be guaranteed to be main-program variables at level 
two and above without a painstaking check of the names of 
all internal variables through all possible chains of calls, 
is one reason for avoiding unnecessary references to 
external variables in procedure bodies. 

The Use of NRETURN to Return a Variable. Any procedure
call which returns a non-null string (or an object of
datatype Name) may occur to the left of an assignment sign
as the operand of an indirect referencing operator. This was
indicated in Chapter 5 with the rule

          $SIZE(WORD) = $SIZE(WORD) + 1

and may be further illustrated by the rule

          $COUNT(ANY(VOWELS),WORD) = $COUNT(ANY(VOWELS),WORD) + 1

which adds one to the value of the variable named by the
number of vowels found within a word. As another example,
the statement






$TRIM(INPUT) = LINE1 : F(DONE) 

assigns the value of LINE1 to the variable named by the 
characters of the next trimmed data record, or causes an 
execution-time error if the trimmed record is null. 

Programmer-defined procedures can be written 
specifically for the purpose of returning a string which will be 
used as the operand of the $ operator to return a variable. 
Consider, for example, the problem of determining the first 
null-valued variable of the set LIST1, LIST2, ..., $('LIST' 
N), described in Chapter 5, and then assigning that variable 
the value of the next data record. A procedure named 
NEXTNULL might be written to determine the first 
null-valued variable as follows. 

          DEFINE('NEXTNULL(NAME)N','PR.NEXTNULL')
+                                        : (END.NEXTNULL)
PR.NEXTNULL N = N + 1
          NEXTNULL = IDENT($(NAME N),NULL) NAME N
+                                        : S(RETURN) F(PR.NEXTNULL)
END.NEXTNULL

The NEXTNULL() procedure cannot fail, so it may be used 
in a statement of the form 

$NEXTNULL('LIST') = TRIM(INPUT) : F(NODATA) 

The procedure is called with a string-valued argument 
representing that part of the name which is common to all 
the variables. This string is concatenated to the value of 
the variable N internal to the procedure call, and the $ 
operator is applied to the result of this concatenation to 
return a variable. If the value of this variable is null, a 
string representing the name of the variable is formed by 
concatenation and assigned as the value of the result 
variable; this string is returned as the value of the 
procedure call, where it is used as the operand of the $ 
operator which returns the variable needed to perform the 
assignment. 

Since N is declared as internal, it is assigned the 
null value every time the NEXTNULL() procedure is called, 
hence the search for the "next" variable always begins from 
one. If the search were to begin from the value given N the 
last time the procedure returned, i.e., from the last 
variable located, then N should not be declared as internal 
so that it would retain its value from one procedure call to 
the next. 






A procedure can be caused to return a variable, rather 
than a string which can be used by the $ operator to return 
a variable, with the use of the name return NRETURN. This 
return may be used only if the value of the result variable 
is a string (or a Name); it effectively applies the $ 
operator to the value of the result variable, causing the 
variable named by that value to be returned as the value of 
the procedure call. Using NRETURN, the NEXTNULL() procedure 
may be written as follows. 

          DEFINE('NEXTNULL(NAME)N','PR.NEXTNULL')
+                                        : (END.NEXTNULL)
PR.NEXTNULL N = N + 1
          NEXTNULL = IDENT($(NAME N),NULL) NAME N
+                                        : S(NRETURN) F(PR.NEXTNULL)
END.NEXTNULL

This version of NEXTNULL() is exactly the same as its 
predecessor except that NRETURN has been written instead of 
RETURN in the last statement of the procedure body, causing 
the variable named by the string formed by concatenating the 
value of NAME and N to be returned, rather than that string. 
A reference to this new NEXTNULL() procedure would have the 
form 

NEXTNULL('LIST') = TRIM(INPUT) : F(NODATA) 

The $ operator is now not wanted before the procedure 
reference since NRETURN has effectively applied it already. 

NRETURN is provided for convenience only; its effect 
may always be obtained by using RETURN within the procedure 
body to return the name of a variable, and by placing a $ 
operator directly before the procedure reference. Further 
examples of the use of NRETURN may be found in Chapters 7 
and 8. 
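
The equivalence of the two forms can be seen in a minimal sketch. The procedure names SLOT1 and SLOT2 and the variables BIN1 and BIN2 are hypothetical, introduced only for this illustration, and blanks within literals are written as ordinary blanks on the assumption of a standard Snobol4 system.

```snobol4
* SLOT1 RETURNS A NAME AS A STRING - THE CALLER APPLIES $
          DEFINE('SLOT1(K)','PR.SLOT1')        :(END.SLOT1)
PR.SLOT1  SLOT1 = 'BIN' K                      :(RETURN)
END.SLOT1
* SLOT2 RETURNS THE VARIABLE ITSELF BY NRETURN
          DEFINE('SLOT2(K)','PR.SLOT2')        :(END.SLOT2)
PR.SLOT2  SLOT2 = 'BIN' K                      :(NRETURN)
END.SLOT2
* BOTH RULES ASSIGN TO NATURAL VARIABLES, BIN1 AND BIN2
          $SLOT1(1) = 'FIRST'
          SLOT2(2) = 'SECOND'
          OUTPUT = BIN1
          OUTPUT = BIN2
END
```

If the sketch behaves as described in the text, the two printed lines should be the values assigned through the two procedure calls, showing that only the placement of the $ operator differs between them.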

The APPLY() Procedure. A procedure reference in a 
program text is composed of a procedure name followed 
directly by an argument list enclosed within parentheses. 
Although these arguments may be represented by arbitrarily 
complex expressions, which when evaluated yield appropriate 
values, the procedure name may not be so represented but 
must be an identifier. 

There are some applications, however, in which the 
programming would be much simplified if one could indicate 
generally, rather than specifically, which procedure is to 
be called. Consider, for example, a series of procedures 
named FIX1, FIX2, FIX3, etc., each one designed to "fix" a 
word of the indicated length. A procedure call something 
like $('FIX' SIZE(WORD))(WORD) is what is needed in order to 
call the appropriate procedure for any given word, but this 
expression is syntactically incorrect. 

Assigning an expression representing the procedure name 
to another variable, as in 

TEMP = 'FIX' SIZE(WORD) 

and then applying the $ operator as in $TEMP(WORD) gives an 
expression which is syntactically correct but does not 
produce the desired result; in this case the procedure call 
TEMP(WORD) is evaluated, and its value used as the operand 
of the $ operator. (Of course, if no procedure TEMP() were 
defined, which is the most likely case, an execution-time error 
would result when it was called.) 

A way of calling a procedure, in which the name of the 
procedure to be called is determined at execution time, is 
provided by the predefined procedure APPLY(), whose first 
argument may be any expression which yields a string naming 
the procedure to be called, and whose remaining arguments 
are any expressions representing the arguments to be 
supplied to that procedure. APPLY() may be applied to 
predefined procedures as well as to programmer-defined ones; 
thus 

WORD = APPLY('TRIM',INPUT) 

is equivalent to 

WORD = TRIM(INPUT) 

and 

OUTPUT = APPLY('LONGER',STRING1,STRING2,VOWELS) 

is equivalent to 

OUTPUT = LONGER(STRING1,STRING2,VOWELS) 

More usefully, the designation of the appropriate 
procedure from the set FIX1, FIX2, FIX3, etc., could be made 
with the evaluation rule 

APPLY('FIX' SIZE(WORD),WORD) 

which is equivalent to the rule 

FIX3(WORD) 

if WORD has a value three characters long. Similarly, 
executing the statement 

APPLY(TRIM(INPUT),ARG1,ARG2) : F(ERROR) 

calls the procedure whose name is specified on the next data 
record, giving it the two arguments ARG1 and ARG2. 

The value returned by APPLY() is the value returned by 
the procedure which it calls, and APPLY() returns with 
whatever return (RETURN, NRETURN, or FRETURN) is used by 
that procedure. 
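
As a minimal sketch of this kind of dispatch, the following program defines two stand-in procedures and lets the size of the word choose between them. The bodies given to FIX1() and FIX2() here are invented purely for illustration; only the use of APPLY() follows the text.

```snobol4
* HYPOTHETICAL "FIX" PROCEDURES FOR WORDS OF LENGTH 1 AND 2
          DEFINE('FIX1(W)','PR.FIX1')          :(END.FIX1)
PR.FIX1   FIX1 = W W                           :(RETURN)
END.FIX1
          DEFINE('FIX2(W)','PR.FIX2')          :(END.FIX2)
PR.FIX2   FIX2 = W '!'                         :(RETURN)
END.FIX2
* THE PROCEDURE NAME IS FORMED AT EXECUTION TIME
          WORD = 'A'
          OUTPUT = APPLY('FIX' SIZE(WORD),WORD)
          WORD = 'TO'
          OUTPUT = APPLY('FIX' SIZE(WORD),WORD)
END
```

The first APPLY() call reaches FIX1() and the second reaches FIX2(), although the two rules are written identically; a word of a length for which no FIX procedure is defined would produce an execution-time error.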

Note that APPLY is defined to have a varying rather 
than a fixed number of arguments, always one more than that 
of the procedure specified in its first argument. However, 
the usual rules about missing and extra arguments pertain: 
if the number of arguments beginning with the second exceeds 
the number of formal variables specified for the procedure 
being called, the extra arguments are evaluated but 
otherwise ignored; if there are fewer arguments than formal 
variables, each remaining formal variable is assigned the 
null value. 

Although the name of the procedure may be represented 
by an expression of any complexity, that expression must 
yield a string which is an identifier when evaluated. This 
restriction comes about because all the names in the first 
argument of the DEFINE () procedure must be identifiers; all 
predefined procedures, of course, have names which are in 
identifier form. 

Using a Library of Procedures. Most tasks which a 
program is to perform divide themselves naturally into a 
series of smaller tasks, some of which are so basic as to be 
repeated many times during the course of the program. If 
each basic part is written as a procedure, then the 
organization of the program can be clearly seen; the body of 
each procedure need occur within the program text only once, 
but it may be referred to whenever it is needed. Once a 
procedure has been thoroughly tested, it may form part of 
the programmer's "library" to be used, just as the 
predefined procedures are used, as a part of many different 
programs. 

The complete program text below begins by providing the 
library of procedures to which it will refer; with the 
exception of the PRINT() procedure, these procedures have 
all occurred earlier in this chapter with the same 
definitions. After the library comes the main program, which 
consists largely of references to these procedures. The 
purpose of the program is to read data from the input file, 
isolate the words, and store them in "lists" according to 
their size. When all the words have been read in and stored, 
the lists are printed, in order of increasing word size, 
with the words in each list in the order in which they were 
encountered. In addition, each word of a list which is a 
palindrome is underlined by printing a row of hyphens 
beneath it on the succeeding line. At the end of each list, 
numbers are printed indicating the number of words in the 
list and the number of palindromes; when all the lists have 
been printed, the total number of words and of palindromes 
is also provided. 


The main program begins by determining the characters 
which are to be considered as punctuation by reading them in 
from the first record of the input data. It then proceeds to 
read each subsequent data record, which consists of words 
separated by spaces and punctuation and appearing in no 
fixed format, except that no word is broken across a record. 
As each word is found, the STORE() procedure is invoked to 
store the word in the list appropriate to its size. When all 
the words have been processed, the PRINT() procedure is 
called to print the lists, shortest words first, and to 
underline each word which is a palindrome. The PRINT() 
procedure invokes the PALIN() procedure to determine whether 
or not the word is a palindrome, the REPEAT() procedure to 
form an underline of the needed length, and the SKIP() 
procedure to produce blank lines. The PRINT() procedure 
counts the words and palindromes occurring in each list by 
incrementing the values of the internal variables W and P, 
printing their values before it returns. It also adds to the 
total count of words and palindromes by incrementing the 
values of the main-program variables WORDS and PALINS; these 
values persist and increase through successive calls to 
PRINT(). 

* PROCEDURE TO CONCATENATE A STRING OR PATTERN N TIMES
*
          DEFINE('REPEAT(N,OBJECT)','PR.REPEAT')
+                                        : (END.REPEAT)
PR.REPEAT N = GT(N,0) N - 1              : F(RETURN)
          REPEAT = REPEAT OBJECT         : (PR.REPEAT)
END.REPEAT
*
* TEST PROCEDURE TO FIND PALINDROMES (FAILS IF NOT A PALIN)
          DEFINE('PALIN(CAND)CH','PR.PALIN')
* SET UP PATTERN NEEDED BY THE PALIN() PROCEDURE
* ASSIGN IT TO A MAIN-PROGRAM VARIABLE
          PAL.PAT = POS(0) LEN(1) $ CH RTAB(1) . CAND *CH
+                                        : (END.PALIN)
* IF CANDIDATE NOW CONSISTS OF 1 OR 0 CHARACTERS, SUCCEED
* OTHERWISE APPLY THE PATTERN AGAIN
PR.PALIN  LE(SIZE(CAND),1)               : S(RETURN)
          CAND PAL.PAT                   : S(PR.PALIN) F(FRETURN)
END.PALIN

* SIDE-EFFECT PROCEDURE TO SKIP N LINES ON OUTPUT FILE
          DEFINE('SKIP(NUM)','PR.SKIP')  : (END.SKIP)
PR.SKIP   NUM = GT(NUM,0) NUM - 1        : F(RETURN)
          OUTPUT = NULL                  : (PR.SKIP)
END.SKIP

* SIDE-EFFECT PROCEDURE TO STORE WORDS IN LISTS BY SIZE
          DEFINE('STORE(WORD,NAME)','PR.STORE') : (END.STORE)
* ADD ONE TO THE INDEX NUMBER FOR THIS SIZE WORD LIST
PR.STORE  $(SIZE(WORD) NAME) = $(SIZE(WORD) NAME) + 1
* STORE THE WORD AS THE VALUE OF THE "NEXT" VARIABLE
          $(SIZE(WORD) NAME $(SIZE(WORD) NAME)) = WORD
+                                        : (RETURN)
END.STORE

* PROCEDURE TO PRINT WORDS, UNDERLINE PALINS, KEEP COUNTS
          DEFINE('PRINT(N,NAME)M,W,P','PR.PRINT')
+                                        : (END.PRINT)
PR.PRINT  OUTPUT = 'LIST OF ' N '-LETTER WORDS'
          SKIP(1)
* TEST FOR END OF LIST - IF NOT END, PRINT NEXT WORD
UP.PRINT  M = LT(M,$(N NAME)) M + 1      : F(DONE.PRINT)
          OUTPUT = $(N NAME M)
*
* ADD ONE TO THE WORD COUNT FOR THIS SIZE
          W = W + 1
* UNDERLINE WORD IF IT IS A PALINDROME
          OUTPUT = PALIN(OUTPUT) REPEAT(N,'-') : F(UP.PRINT)
* ADD ONE TO THE PALINDROME COUNT FOR THIS SIZE
          P = P + 1                      : (UP.PRINT)
* ALL WORDS HAVE BEEN PRINTED - PRINT THE COUNTS
DONE.PRINT SKIP(1)
          OUTPUT = W '   ' N '-LETTER WORDS'
          OUTPUT = IDENT(P,NULL) '0   ' N '-LETTER'
+                 ' PALINDROMES'         : S(W.PRINT)
          OUTPUT = P '   ' N '-LETTER PALINDROMES'
* ADD THESE TOTALS TO THE COUNTS FOR ALL SIZES
          PALINS = PALINS + P
W.PRINT   WORDS = WORDS + W
          SKIP(2)                        : (RETURN)
END.PRINT
*
* MAIN PART OF PROGRAM
*
* INITIALIZE BY DETERMINING THE PUNCTUATION CHARACTERS
* AND FORMING A WORD-FINDING PATTERN
          PUNC = ' ' TRIM(INPUT)         : F(ERROR)
          WORD.PAT = BREAK(PUNC) . WORD SPAN(PUNC)
* MAIN READ LOOP - GET THE NEXT RECORD
READ      RECORD = TRIM(INPUT) ' '       : F(LIST)
* REMOVE ANY INITIAL SPACES OR PUNCTUATION
          RECORD POS(0) SPAN(PUNC) = NULL
*
* GET THE NEXT WORD
NEXTWORD  RECORD WORD.PAT = NULL         : F(READ)
*
* SAVE LENGTH OF LONGEST WORD IN MAX
          MAX = GT(SIZE(WORD),MAX) SIZE(WORD)
*
* STORE THE WORD IN THE LIST FOR ITS SIZE
          STORE(WORD,'LIST')             : (NEXTWORD)
* PRINT THE LISTS, SHORTEST ONES FIRST
LIST      N = LT(N,MAX) N + 1            : F(FINAL)
* IF THERE ARE WORDS OF LENGTH N, PRINT THEM
          (DIFFER($(N 'LIST'),NULL) PRINT(N,'LIST'))
+                                        : (LIST)
* PRINT SOME FINAL STATISTICS, PREPARED BY PRINT()
FINAL     OUTPUT = 'TOTAL NUMBER OF WORDS -- ' WORDS
          OUTPUT = 'TOTAL NUMBER OF PALINDROMES -- ' PALINS
+                                        : (END)
*
ERROR     OUTPUT = 'NO DATA'
END



If the input to this program were the question 

DID THE NAME ADA REFER TO A VARIABLE AT LEVEL 1 OR LEVEL 2 

then the output would be as follows. 

LIST OF 1-LETTER WORDS

A
-
1
-
2
-

3   1-LETTER WORDS
3   1-LETTER PALINDROMES


LIST OF 2-LETTER WORDS

TO
AT
OR

3   2-LETTER WORDS
0   2-LETTER PALINDROMES


LIST OF 3-LETTER WORDS

DID
---
THE
ADA
---

3   3-LETTER WORDS
2   3-LETTER PALINDROMES


LIST OF 4-LETTER WORDS

NAME

1   4-LETTER WORDS
0   4-LETTER PALINDROMES


LIST OF 5-LETTER WORDS

REFER
-----
LEVEL
-----
LEVEL
-----

3   5-LETTER WORDS
3   5-LETTER PALINDROMES


LIST OF 8-LETTER WORDS

VARIABLE

1   8-LETTER WORDS
0   8-LETTER PALINDROMES


TOTAL NUMBER OF WORDS -- 14
TOTAL NUMBER OF PALINDROMES -- 8



7A. ARRAYS 

The programming of some problems can be greatly 
simplified with the use of sets of successively-named 
variables, such as those described in Chapters 5 and 6. 
There, indirect referencing was used to refer to variables 
with some set of names such as LIST1, LIST2, ..., $('LIST' N). 
The variables could be thought of as forming a set because 
their names were composed of two parts, where one part was 
common to all names of the set and the other part varied; 
the variables were said to be successively-named because the 
varying part was an integer which differed by one for each 
member of the set. The notion that the variables with names 
differing in this way were logically associated was, of 
course, simply a convention adopted by the programmer. But 
the idea of a set of variables associated together, with the 
selection of any one of them dependent on the value of an 
arithmetic expression, is so useful that data structures of 
this sort are predefined in Snobol, under the name of 
Arrays. An array is used very much like a set of variables 
with successive names, except that the convention that the 
variables constitute a set is not the programmer's alone, 
but is shared by the Snobol system. Thus it is possible to 
treat the set of variables as a single aggregate in some 
cases, and to make reference to specific variables in the 
set on other occasions. 

Creating an Array. An array is created by executing a 
call to the predefined procedure ARRAY(). The ARRAY() 
procedure has a single string-valued argument, which in its 
simplest form is used to specify the number of variables of 
which the array is to be composed. For example, execution of 
the rule 

LIST = ARRAY('1000') 

causes an array of 1000 variables to be created; this array 
is returned as the value of the ARRAY () procedure and the 
entire aggregate is assigned as the value of the variable 
named LIST. 

The variables forming an array are distinct from other 
variables in that they do not have names which can be 
written directly in program texts. Rather, they are usually 
represented in a program text by expressions which are 
composed of two parts: the first part consists of the name 
of a variable whose value is the entire "family" of 
variables that make up the array; the second part, called 
the "selector," consists of at least one integer-valued 
expression, called an index, enclosed within square brackets 
and immediately following the family part of the name. 
Consecutive integer selectors are assigned to each variable 
of the array and serve to select a particular variable from 
the set. Thus variable number three of the 1000-variable 
array which is the value of LIST may be referred to as 
LIST[3]. 

When the rule 

LIST = ARRAY('1000') 

is executed, the 1000 variables LIST[1], LIST[2], ..., 
LIST[1000] become available for use. Each of these variables 
initially has the null value, like any other variable, when 
the array is created. These variables may acquire new values 
by the usual means of assignment, as in the statements 

LIST[1] = TRIM(INPUT) : F(DONE) 

LIST[1] POS(0) SPAN(' ') = NULL 

and 

RECORD ANY(VOWELS) . LIST[7] : F(NOVOWEL) 


Although all variables of an array are often assigned 
values of the same datatype, there is no requirement that 
this be done: some may be assigned strings as values, and 
some Patterns, for instance; such a variable may even have 
an Array as its value, including the array of which it is 
itself a member. 

Array Items and Item References. The variables forming 
an array are called "array items"; references to these 
variables in program texts, expressions of the form LIST[N], 
are called "item references." It is important to remember 
that the variables referred to by these item references do 
not have names in the form of strings. That is, the string 
LIST[1] is not the name of variable number one of the array 
which is the value of LIST. For one thing, such a string 
cannot be written in a program text to represent a name 
since it is not in identifier form. Nevertheless, every 
string is the name of a variable, so the string LIST[1] is 
indeed the name of some variable, which may be represented 
in a program text as $'LIST[1]'; however, this variable has 
no intrinsic connection with any array. 

The variables with strings as names are all available 
to a programmer when execution of a program begins, and are 
called "natural" variables; in contrast, variables which are 
array items must be explicitly created by a call to the 
ARRAY() procedure, and in consequence are called "created" 
variables. They have names which are not strings 
(necessarily, since every possible string is the name of a 
natural variable). If the name of a variable which is an 
array item is needed (so that it may be passed as an 
argument to a procedure, for example), a special kind of 
non-string Name must be generated by the use of the name 
operator described toward the end of this chapter. 

The family part of an item reference, LIST in the 
example above, must always be an identifier and must refer 
to a variable whose value is an array. However, natural 
variables whose names are not in identifier form, such as 
the one represented by $(CHAR '*'), and created variables, 
such as the one represented by LIST[3], may be assigned 
arrays as values. Special methods, described later in this 
chapter, must then be used to form references to the items 
of these arrays. Note that references to all items of an 
array are always formed with the use of a single name, that 
of a variable whose value is the array to which they belong. 

Comparison with Indirect Referencing. A set of 
successively-named variables formed with the use of indirect 
referencing constitutes a sort of simulated array. These 
simulated arrays have some advantages over the predefined 
array structures provided by Snobol. 

When indirect referencing is used, it is not necessary 
to specify in advance how many variables will belong to the 
set. That is, in the loop 

NLOOP     N = N + 1 

          OUTPUT = TRIM(INPUT)           : F(ALLGONE) 

          $('LIST' N) = OUTPUT           : (NLOOP) 

the maximum value of N is determined only by the number of 
data records read, which may vary with each use of the 
program. 

There is also no restriction that N be incremented only 
by 1 — any interval may be used, not necessarily the same 
one on each iteration of the loop. Thus the statement 
labelled NLOOP above may read 

NLOOP     N = N + 2 

or 

NLOOP     N = N + SIZE($('LIST' N)) 

or whatever. 






Further, there is no necessity to use numeric values at 
all in forming the varying part of a name. For example, the 
"successively-named" variables LISTA, LISTB, ..., LISTZ 
could be used by writing the loop 

          ALPHA = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 
          CHARPAT = LEN(1) . CHAR 
LOOP      ALPHA CHARPAT = NULL           : F(DONE) 
          $('LIST' CHAR) = TRIM(INPUT)   : S(LOOP) 

For that matter, there is no need for the variables of a 
simulated array to have names which are obviously 
"successive." Thus, the varying part of each name could be 
formed from a list of words which might have no obvious 
relation to one another. Using a word as a "selector" of a 
simulated array item provides much more information than the 
use of an often arbitrary number. Lastly, no difficulties 
arise if the "family" part of the names is not in identifier 
form. 

On the other hand, there are some advantages to using 
the predefined array structure. The principal one is that 
the array items are recognized as being related by the 
Snobol system, so the whole aggregate can be assigned as the 
value of a variable, passed as an argument to a procedure, 
and so forth. Also, the variables which are array items are 
distinct from all other variables since they do not have 
names in the form of strings, so inadvertent conflicts of 
variable usage are easily avoided; and sometimes an item 
reference in a program text gives a more intuitive picture 
of the process being programmed than does an expression 
involving indirect referencing. 

An array is a particularly useful data structure to 
employ when the numeric order of its items is significant, 
e.g., when the n-th item of some list is needed. For data 
which does not lend itself well to being processed in terms 
of numeric ordering, other types of data structures are 
probably more useful. Ways of creating data structures of 
one's own choosing are indicated in the following chapter. 

Multi-dimensional Arrays. It is often intuitively 
useful to think of the items of an array as being arranged 
in more than the single dimension of the LIST example above. 
One might want, for example, to simulate the moves on a 
chessboard by using an 8x8 array which is the value of a 
variable named BOARD. Such a two-dimensional, 64-item array 
could be created by executing the rule 






BOARD = ARRAY('8,8') 

The first row of the chessboard could then be represented by 
giving values to the items referred to as BOARD[1,1], 
BOARD[1,2], ..., BOARD[1,8]. The programmer is of course 
free to decide which dimension is to be thought of as 
indicating the rows and which as indicating the columns. If 
he prefers the opposite convention, then the first row would 
be the items BOARD[1,1], BOARD[2,1], ..., BOARD[8,1]. 

Similarly, a three-dimensional tic-tac-toe board having 
a 5x5 square on each of its three planes could be simulated 
by using the array created by executing the rule 

TIC3 = ARRAY('5,5,3') 

The central cell of this structure is the array item 
TIC3[3,3,2]. 

Although it is difficult to symbolize or conceptualize 
arrays of more than three dimensions, they present no 
programming problems. For each new dimension, another number 
within the argument of the ARRAY() procedure is needed for 
the creation of the array; similarly, another index is 
needed within the selector to form an appropriate reference 
for any given array item. There are no limitations on the 
number of dimensions which an array may have, or on the 
number of items to be associated with each dimension. 

Arrays of many dimensions can be used to arrange data 
elements which differ from one another along many numeric 
scales. Each "dimension" is thought of as an "attribute," 
and a data element is assigned to a particular array item 
according to the numeric value of all its attributes. The 
data elements may then be accessed in an orderly manner 
along each "dimension" of the arrangement. 
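
A minimal sketch of a two-dimensional array in use follows. The variables COL, SQUARE, and ROW and the labels NEXTCOL and SHOW are hypothetical names chosen for this illustration. Square brackets are used for item references as elsewhere in this chapter; some Snobol4 systems expect angle brackets instead.

```snobol4
* CREATE AN 8 BY 8 BOARD AND MARK ONE SQUARE OF ROW 1
          BOARD = ARRAY('8,8')
          BOARD[1,5] = 'K'
* BUILD A PICTURE OF ROW 1 - EMPTY SQUARES SHOWN AS PERIODS
NEXTCOL   COL = LT(COL,8) COL + 1              :F(SHOW)
          SQUARE = BOARD[1,COL]
          SQUARE = IDENT(SQUARE,NULL) '.'
          ROW = ROW SQUARE                     :(NEXTCOL)
SHOW      OUTPUT = ROW
END
```

Every item of a newly created array has the null value, so only the one square that was assigned a value appears as something other than a period in the printed row.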

The ARRAY () Procedure. The predefined procedure ARRAY () 
requires a single string-valued argument which provides a 
prototype of the array, specifying (implicitly or 
explicitly) the number of dimensions the array is to have 
and the range of index numbers which may be used to select 
items of this array in each dimension. Unless otherwise 
specified, it is assumed that the indexing in each dimension 
starts with 1. However, if the arrays described above as 
being the values of LIST, BOARD, and TIC3 were to be indexed 
from zero instead of from one, but were still to have the 
same number of items as before, this could be specified by 
executing the rules 



LIST = ARRAY('0:999') 
BOARD = ARRAY('0:7,0:7') 
TIC3 = ARRAY('0:4,0:4,0:2') 

The colon within the argument is used to separate the lowest 
index number from the highest index number for each 
dimension; the comma is used to separate the different 
dimensions from one another; no embedded blanks are 
permitted. 

Negative numbers may be used within the prototype of an 
array, and consequently within the selectors of its items. 
Execution of the rule 

NEGARR = ARRAY('-50:-5') 

creates a 46-element array whose items may be referred to as 
NEGARR[-50], NEGARR[-49], ..., NEGARR[-5]. (Note that these 
references are arranged, as always, in ascending arithmetic 
order.) 
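
The following sketch, offered only as an illustration, uses the NEGARR array of the text to show the lowest and highest indices in use. The label OUTSIDE and the printed messages are hypothetical.

```snobol4
* AN ARRAY INDEXED FROM -50 THROUGH -5
          NEGARR = ARRAY('-50:-5')
          NEGARR[-50] = 'FIRST'
          NEGARR[-5] = 'LAST'
          OUTPUT = NEGARR[-50] ' ' NEGARR[-5]
* AN INDEX OUTSIDE THE DECLARED RANGE MAKES THE RULE FAIL
          OUTPUT = NEGARR[0]                   :F(OUTSIDE)
          OUTPUT = 'NOT REACHED'               :(END)
OUTSIDE   OUTPUT = 'INDEX 0 IS OUTSIDE -50:-5'
END
```

The reference NEGARR[0] is not an error; it simply fails, because zero lies outside the range given in the prototype.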

Information about the range of index numbers in each 
dimension may be provided in terms of any expressions which 
give the desired numbers when evaluated. These indices may 
be positive, negative, or zero, but the upper bound for any 
dimension must always be greater than or equal to the 
corresponding lower bound; consequently an array must always 
be composed of at least one item. Thus the rules 



ARRAY1 = ARRAY(SIZE(WORD1) ',' SIZE(WORD2)) 
ARRAY2 = ARRAY(M1 ':' N1 ',' M2 ':' N2) 
ARRAY3 = ARRAY(A + B ',' C) 

may each specify the creation of a two-dimensional array, if 
the expressions within the argument of each ARRAY() 
procedure have appropriate numeric values at the time the 
rules are executed. 



Note that the commas and colons are placed within 
quotes to indicate that they are literal characters to be 
concatenated into the string being formed to provide the 
single argument. If the commas were not placed within 
quotes, each comma would indicate the presence of another 
argument for the ARRAY() procedure; all arguments after the 
first would be evaluated but otherwise ignored, since 
ARRAY() requires only one argument. The array procedure 
returns as its value an array created to the specifications 
of its argument. Thus the variables named ARRAY1, ARRAY2, 
and ARRAY3 in the above example would all be assigned values 
of datatype Array. 



Selectors. Selectors may also consist of any 
expressions which yield the desired index (or indices) when 
evaluated. Thus 

LIST[1] 

LIST[A + B] 

LIST[SIZE(TRIM(CARD))] 

LIST[$LIST[2]] 

LIST[LIST[LIST[2]]] 

are all item references which may be used to refer to 
variable number one of the array which is the value of LIST 
if the expressions A + B and SIZE(TRIM(CARD)) and $LIST[2] 
and LIST[LIST[2]] all have the value 1 when the rules in 
which the above expressions appear are executed. 

Although the prototype of the array is expressed as a 
string, note that the selector of an item reference is not; 
rather the expressions representing the indices are 
separated by commas, much like the arguments of a procedure 
reference. Thus BOARD[X,Y] is an appropriate item reference 
for a two-dimensional array, while BOARD[X ',' Y], which 
specifies a non-integer index, is not. An execution-time 
error will occur if a non-integer results from the 
evaluation of the index for any dimension, or if the number 
of dimensions indicated by the selector is not the same as 
the number specified by the prototype for that array. 

Failure of an Item Reference. An attempt to evaluate an 
item reference may fail, causing failure of the rule in 
which the evaluation occurs. An item reference fails when 
its family part refers to a variable whose value is an 
array, but its selector yields an index for any dimension 
which falls outside the range specified by the prototype of 
that array. Thus the rule 

OUTPUT = LIST[N] : F(DONE) 

will fail and send control to DONE for values of N which are 
less than 1 or greater than 1000 for the value of LIST 
described at the beginning of this chapter. The simple two- 
statement loop 

LOOP N = N + 1 

          OUTPUT = LIST[N]               : S(LOOP) F(DONE) 

can therefore be used to print the values of all items of 
the array referred to by LIST (provided these values are all 
strings). Here the fact that the item reference can cause 
failure of the rule eliminates the need for a statement of 
the form 

N = LT(N,1000) N + 1 : F(DONE) 

to terminate the loop and so somewhat simplifies the 
programming. (Note that the values of all the items of an 
array cannot be printed by a rule of the form OUTPUT = LIST, 
since LIST has an array as its value, and only strings can 
be printed.) 

Often reliance on the failure of an item reference 
rather than on the failure of some test procedure does not 
simplify the programming and may lead to logical errors. For 
example, the loop 

FILL1     N = N + 1 

          LIST[N] = TRIM(INPUT)          : F(FULL) S(FILL1) 

will fail and send control to FULL (1) when the value of N 
becomes greater than 1000 or (2) when the data is exhausted, 
without making the (often necessary) distinction between the 
two cases. The fact that an item reference can cause failure 
of the rule must always be kept in mind to prevent the 
writing of rules which may fail for more than one reason. 
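
One way of keeping the two causes of failure apart, sketched here under the assumption that an explicit test on N is acceptable, is to let a test procedure detect the full array so that the only remaining cause of failure in the assignment rule is the exhaustion of the data. The label NODATA and the printed messages are hypothetical.

```snobol4
          LIST = ARRAY('1000')
* THE TEST PROCEDURE, NOT THE ITEM REFERENCE, DETECTS A FULL ARRAY
FILL1     N = LT(N,1000) N + 1                 :F(FULL)
* THIS RULE CAN NOW FAIL ONLY BECAUSE INPUT FAILS
          LIST[N] = TRIM(INPUT)                :S(FILL1)F(NODATA)
FULL      OUTPUT = 'ARRAY FULL AFTER ' N ' RECORDS'   :(END)
* N WAS INCREMENTED BEFORE THE READ THAT FAILED
NODATA    OUTPUT = 'DATA EXHAUSTED AFTER ' (N - 1) ' RECORDS'
END
```

This costs the extra test which the item-reference failure was said to save, but each failure transfer now has exactly one meaning.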

Special Problems Concerning Item References. It is 
possible to assign an array as the value of a variable whose 
name cannot be represented in identifier form, either 
because it contains impermissible characters, as in 

$'A/1' = ARRAY('1000') 

or because it is a created variable, as in 

LIST[1] = ARRAY('1000') 

or because it is unknown, as in 

$WORD = ARRAY('1000') 

Although each of the above rules creates an array of 
1000 items and assigns it as the value of some variable as 
in all previous examples, the items of those arrays may not 
be referred to in the usual manner, since there is a 
restriction that the family part of an item reference must 
be a name in identifier form. Thus if one attempts, for the 
first two cases above, to write rules of the form 

$'A/1'[1] = TRIM(INPUT) 

and 

LIST[1][1] = TRIM(INPUT) 

then compile-time errors result. 

Writing, for the third case, the rule 

        $WORD[1] = TRIM(INPUT)

does not result in a compile-time error, but does not give
the desired result either. Here, the operand of the indirect
referencing operator is not the variable WORD, as is
desired, but rather the item reference WORD[1]. The
evaluation of WORD[1] should cause an execution-time error,
since the variable WORD was intended as the operand of the
indirect referencing operator, and thus its value should be
a string or a Name, not an array.

All of these cases may be taken care of by simply
assigning each array to another variable, one whose name may
be represented by an identifier. Each of the erroneous rules
presented before can thus be replaced by a pair of rules,
such as the following:

        TEMP1 = $'A/1'
        TEMP1[1] = TRIM(INPUT)

        TEMP2 = LIST[1]
        TEMP2[1] = TRIM(INPUT)

        TEMP3 = $WORD
        TEMP3[1] = TRIM(INPUT)

Note that assigning an array to a second variable does 
not cause a new array to be created, but merely allows two 
(or more) variables to have the same array as their values. 
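
The same sharing can be observed in most languages with reference semantics. The following lines are a Python analogy only: a Python list stands in for a Snobol array, and the variable names are chosen here for illustration.

```python
# A Python list stands in for a Snobol array in this analogy.
list_1 = [""] * 3      # plays the role of LIST[1], whose value is an array
temp2 = list_1         # like TEMP2 = LIST[1]: no new array is created
temp2[0] = "WORD"      # an assignment made through the second variable ...
print(list_1[0])       # ... is visible through the first one
print(temp2 is list_1) # both names refer to one and the same object
```

Because both variables have the same array as value, a change made through either one is seen through the other.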

The ITEM() Procedure. The ITEM() procedure provides
another method of referring to the items of an array when
the array has been assigned to a variable whose name cannot
be written in identifier form. The ITEM() procedure, like
the APPLY() procedure described in Chapter 6, has a varying
number of arguments, usually one more than the number of
dimensions of the array involved. The first argument must be
an expression whose value is an array; the remaining
arguments may be any integer-valued expressions, usually one
for each dimension of the array, given in the appropriate
order. ITEM() returns as its value (by NRETURN) the variable
specified by using its first argument to indicate a family
and its remaining arguments together to form a selector.
Thus the expression ITEM(LIST,1) is equivalent to the
expression LIST[1], and ITEM(BOARD,8,8) is equivalent to
BOARD[8,8]. More usefully, the rules

        ITEM($'A/1',1) = TRIM(INPUT)

        ITEM(LIST[1],1) = TRIM(INPUT)

and

        ITEM($WORD,1) = TRIM(INPUT)

could all be used in place of the rules involving TEMP1,
TEMP2, and TEMP3, above.

A procedure reference to ITEM() may be written wherever
an item reference may appear. Thus the rule

        OUTPUT = TIC3[X,Y,Z]

may be written as

        OUTPUT = ITEM(TIC3,X,Y,Z)

with the same effect. ITEM() fails, in just the way that an
item reference fails, if the index for any dimension within
the selector which is formed falls outside the range
specified by the prototype of the array involved.

Although the selector part of an item reference must
consist of a list of indices separated by commas, as in
TIC3[X,Y,Z], and may not be expressed as a concatenated
string, as in TIC3[X ',' Y ',' Z], the ITEM() procedure
allows the selector to be represented by either method and
even by combinations of the two. Furthermore, ITEM() does
not require that the proper number of index expressions be
present in its arguments. It uses only as many indices as
are appropriate for the array given as its first argument;
it assumes the value zero for missing indices, and evaluates
but otherwise ignores the expressions for extra indices.
Thus the number of arguments with which ITEM() may be called
can vary not only with the number of dimensions of the array
being indexed but also with the choice of representation for
each index. The four-argument call

        ITEM(TIC3,X,Y,Z)

has the same effect as either of the three-argument calls

        ITEM(TIC3,X ',' Y,Z)

or

        ITEM(TIC3,X,Y ',' Z)

or the two-argument call

        ITEM(TIC3,X ',' Y ',' Z)

Each returns the item TIC3[X,Y,Z] as its value. The
importance of this feature is illustrated by an example at
the end of this chapter.
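
The way a selector is assembled from a mixture of separate indices and concatenated strings can be modelled in a few lines of Python. This is a sketch of the rule as stated in the text, not of any actual Snobol implementation; the function name item_selector and its dimensions parameter are assumptions made for the illustration.

```python
def item_selector(dimensions, *index_args):
    """Model how a selector is formed from ITEM()-style index arguments.

    `dimensions` is the number of dimensions of the array; each remaining
    argument is an integer or a comma-separated string of integers.
    """
    indices = []
    for arg in index_args:
        # A string argument such as "2,3" contributes several indices.
        indices.extend(int(part) for part in str(arg).split(","))
    indices += [0] * (dimensions - len(indices))   # missing indices are zero
    return tuple(indices[:dimensions])             # extra indices are ignored
```

Under this model the calls item_selector(3, 1, 2, 3), item_selector(3, "1,2", 3), item_selector(3, 1, "2,3") and item_selector(3, "1,2,3") all form the same selector.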

The PROTOTYPE() Procedure. The PROTOTYPE() procedure
can accept as its single argument any expression whose value
is of datatype Array, and returns as its value a string
giving the prototype of that array. This prototype will be
the same as the one specified in the call to the ARRAY()
procedure which caused the array to be created, except that
the lower bound for each dimension is always explicitly
expressed, and the integers specifying the bounds are in
canonical form (a sign retained only for negative numbers,
leading zeroes suppressed, and zero represented by the
single character 0). Thus if the rules

        BOARD = ARRAY('08,08')
        TIC3 = ARRAY('5,5,3')
        LIST = ARRAY('0:999')
        NEGARR = ARRAY('-50:+5')

have been executed, then execution of the rules

        OUTPUT = PROTOTYPE(BOARD)
        OUTPUT = PROTOTYPE(TIC3)
        OUTPUT = PROTOTYPE(LIST)
        OUTPUT = PROTOTYPE(NEGARR)

will cause the strings

        1:8,1:8
        1:5,1:5,1:3
        0:999
        -50:5
to be printed. Such strings may be investigated with a
pattern-matching rule to determine the structure of the
array; this may be useful in cases where the dimensions have
not been given as literals within the ARRAY() procedure's
argument, but have been specified by more complicated
expressions or supplied from the data. For example, an array
could be created by executing the rule

        BOXES = ARRAY(DIM1 ',' DIM2)

Although the value of BOXES appears to be a two-dimensional
array, this is not necessarily the case since the values of
DIM1 and DIM2, perhaps acquired from the input file, may
contain any number of commas, each indicating another
dimension. The number of dimensions of this array may be
determined by the following simple program segment which
searches the string returned by PROTOTYPE() to determine how
many commas it contains; the number of dimensions is always
one more than the number of commas.

        STRING = PROTOTYPE(BOXES)
LOOP    STRING BREAK(',') ',' REM . STRING     : F(DONE)
        COMMA = COMMA + 1                      : (LOOP)
DONE    DIMENS = COMMA + 1
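
The canonical form of a prototype and the comma-counting rule can both be checked with a small Python model. The function names canonical_prototype and dimensions are hypothetical, and the model assumes, as the examples in the text suggest, that an omitted lower bound is taken to be 1.

```python
def canonical_prototype(spec):
    """Return the canonical prototype string for an ARRAY()-style spec."""
    dims = []
    for dim in spec.split(","):
        if ":" in dim:
            lower, upper = dim.split(":")
        else:
            lower, upper = "1", dim        # assumed: omitted lower bound is 1
        # int() drops leading zeroes and a plus sign, and keeps a minus sign.
        dims.append(f"{int(lower)}:{int(upper)}")
    return ",".join(dims)


def dimensions(prototype):
    """The number of dimensions is one more than the number of commas."""
    return prototype.count(",") + 1
```

For instance, canonical_prototype("-50:+5") gives "-50:5", and dimensions("1:5,1:5,1:3") gives 3.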

The PROTOTYPE() procedure may also take a pattern or a
Name or a structure of programmer-defined datatype as its
argument. A description of the use of PROTOTYPE() with an
argument of one of these datatypes may be found in Appendix
A, section II.B.

The TYPE() Procedure. The TYPE() procedure is one which
will accept any expression as its single argument. If the
value of its argument is of a predefined datatype, the
procedure returns as its value a string specifying that
datatype; if the value is of a programmer-defined datatype,
the string DATA is returned. For example, execution of the
rule

        OUTPUT = TYPE('SASSAFRAS')

will print STRING while execution of the rule

        OUTPUT = TYPE(ARB)

(if ARB still has its predefined value) will produce
PATTERN; the rule

        OUTPUT = TYPE(LIST) '    ' TYPE(LIST[1])

will print ARRAY followed by INTEGER.

TYPE() is often used to test whether or not some
variable has a value of the expected datatype before some
process is allowed to continue. It is particularly useful
for testing whether values passed to the formal variables of
a procedure are of the correct datatype, and for insuring
that all values assigned to OUTPUT are of datatype String or
datatype Integer.

The short loop presented earlier to print the values of
all items belonging to a specified array may be amended with
the use of the TYPE() procedure to first test the datatype
of each value and then to print only those of datatype
String or Integer. This amended program segment uses
indirect referencing within the go-to to transfer to a label
representing the type of the value being processed. If the
value is of datatype String or Integer then the value is
printed; if it is of any other datatype, a message regarding
its type is printed. In either case, the value of the
selector is printed first so that the particular item whose
value is being printed or described may be identified. The
PROTOTYPE() procedure is used in the first statement to
insure that a one-dimensional array is being processed, and
to determine the lower bound of this array.

*  TEST WHETHER ARRAY IS 1-DIMENSIONAL AND FIND LOWER BOUND
        PROTOTYPE(LIST) BREAK(':') . N ':'
+               SPAN('-0123456789') RPOS(0)    : F(ERROR)
*
*  LOOP TO PRINT ALL VALUES WHICH ARE STRINGS
*  IF LIST[N] EXISTS, GO TO THE STATEMENT LABELLED BY THE
*  TYPE OF ITS VALUE
*
LOOP    LIST[N]                  : F(DONE) S($TYPE(LIST[N]))
*
STRING
INTEGER OUTPUT = N '  ' LIST[N]                : (INC)
REAL
PATTERN
ARRAY
NAME
CODE
DATA    OUTPUT = N '  THIS ITEM IS OF TYPE ' TYPE(LIST[N])
*
*  INCREMENT INDEX TO GET NEXT ITEM
INC     N = N + 1                              : (LOOP)

The labels provided in the program text (with the
exception of LOOP and INC) are exactly the strings returned
by the TYPE() procedure. All have been mentioned except
CODE, which is described briefly in Appendix A, section
II.C. These labels provide an exhaustive list of the string
values which TYPE() can return.

The program text may appear strange because of the
number of null rules. Since the statements labelled STRING
and INTEGER both need the same rule, it has been written
only once in the second of these statements, the one
labelled INTEGER. If control is sent to the statement
labelled STRING, it is sent on immediately to the statement
labelled INTEGER where the rule which calls for printing is
executed, since the statement labelled STRING has no rule
and no go-to to be processed. Similarly, since the
statements labelled REAL, PATTERN, ARRAY, NAME, CODE, and
DATA all need the same rule, it is written only once in the
last of these statements, the one labelled DATA.

The evaluation rule LIST[N] is needed in order for
failure of the item reference to be detected. If this
evaluation rule were omitted and the statement consisted
solely of the go-to

                                 : ($TYPE(LIST[N]))



then there would be no way to terminate the loop gracefully, 
and an execution-time error would result when the item 
reference failed within the go-to because the value of N 
became too large. 
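
The idea of choosing the next action from the name of a value's type can be sketched in Python, where a dictionary keyed by type name plays the role of the labels. This is an analogy only: the function name describe_items is hypothetical, Python's type names (str, int, float) stand in for the strings returned by TYPE(), and the end of the list stands in for failure of the item reference.

```python
def describe_items(items, lower=1):
    """Return one output line per item, choosing the action by type name."""
    def show_value(n, value):
        return f"{n}  {value}"

    def show_type(n, value):
        return f"{n}  THIS ITEM IS OF TYPE {type(value).__name__}"

    # The dictionary plays the role of the labels STRING and INTEGER;
    # every other type falls through to show_type, like the label DATA.
    actions = {"str": show_value, "int": show_value}
    lines = []
    for n, value in enumerate(items, start=lower):
        action = actions.get(type(value).__name__, show_type)
        lines.append(action(n, value))
    return lines
```

Several type names sharing one action corresponds to several labels falling through to a single rule.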

Procedure to Return a Selector. There are a number of
processes concerning arrays which it would be convenient to
express as programmer-defined procedures since they are so
frequently needed. For example, one often wants to know the
selector associated with the first null-valued item of an
array so that this item may be given another value. The
following SELECT() procedure fails if there are no null-
valued items, or succeeds and returns the selector of the
first null item as its value. It works for any one-
dimensional array, and uses PROTOTYPE() as before to test
that the array is one-dimensional and to find its lower
bound. The single argument of SELECT() may be any expression
whose value is an array.

        DEFINE('SELECT(ARR1)N','PR.SEL')       : (END.SELECT)
*  TEST WHETHER FIRST ARGUMENT HAS AN ARRAY AS ITS VALUE
PR.SEL  IDENT(TYPE(ARR1),'ARRAY')              : F(SEL.ER1)
*
*  TEST WHETHER ARRAY IS 1-DIMENSIONAL AND FIND LOWER BOUND
        PROTOTYPE(ARR1) BREAK(':') . N ':'
+               SPAN('-0123456789') RPOS(0)    : F(SEL.ER2)
*  TEST WHETHER THIS ITEM HAS A NULL VALUE
*  RETURN ITS SELECTOR IF IT DOES
OUT.SEL SELECT = IDENT(ARR1[N]) N              : S(RETURN)
*  ELSE INCREMENT INDEX TO LOOK AT THE NEXT ITEM
        N = N + 1
*  TEST WHETHER THIS SELECTOR IS OUTSIDE THE BOUNDS OF ARRAY
*  IF SO, THIS ARRAY CONTAINS NO NULL-VALUED ITEMS
        ARR1[N]                        : F(FRETURN) S(OUT.SEL)
*
*  PRINT ERROR MESSAGES AND STOP
SEL.ER1 OUTPUT = ' ARGUMENT OF SELECT() NOT AN ARRAY'
+                                              : (END)
SEL.ER2 OUTPUT = ' ARRAY PASSED IS NOT 1-DIMENSIONAL'
+                                              : (END)
END.SELECT

When this procedure is used, as in the statements

        Q = SELECT(LIST)                 : F(FULL)
        LIST[Q] = WORD

or, equivalently,

        LIST[SELECT(LIST)] = WORD        : F(FULL)

the procedure reference SELECT(LIST) causes the value of the
variable LIST to be assigned as the value of the formal
variable ARR1 internal to the procedure call. If the value
of LIST is an array, as is intended, this means that the two
variables LIST and ARR1 have the same array as their values.
The first statement of the procedure body tests the value of
ARR1 to insure that it is indeed of datatype Array before
proceeding; the second statement further tests that this
array is one-dimensional. If either test fails, an
appropriate error message is written and the procedure ends
execution of the program. If ARR1 has as value a one-
dimensional array, then the lower bound of this array is
assigned to the internal variable N. Then the evaluation
rule ARR1[N] is executed; this refers to the same array item
as LIST[N] since ARR1 and LIST both have the same array as
value. This rule fails only when the value of N exceeds the
upper bound of the array, which occurs only when all items
of the array have already been considered. Hence if the rule
fails the array contains no null-valued items and an FRETURN
is taken. If the rule ARR1[N] does not fail then the value
of ARR1[N] is tested to see whether or not it is null; if it
is null then the result variable SELECT is assigned the
value of N so that this value is returned as the value of
the procedure call.
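
The search that SELECT() performs can be restated compactly in Python. In this sketch a Python list stands in for a one-dimensional Snobol array, the empty string stands in for the null value, and the return value None stands in for failure of the procedure call; the function name select and its lower parameter are assumptions of the illustration.

```python
def select(array, lower=1):
    """Return the selector of the first null-valued item, or None.

    `array` is a list standing in for a one-dimensional array whose
    lower bound is `lower`; "" models the null value and None models
    failure (FRETURN).
    """
    for offset, value in enumerate(array):
        if value == "":
            return lower + offset
    return None
```

A caller would test the result before using it, much as the failure go-to F(FULL) does in the statements shown above.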

Procedure to Interchange Two Arrays. There are some
procedures which need to be passed the name of the variable
whose value is an array, rather than the array which is the
value of that variable. Consider two variables named X and
Y; the value of X is a one-dimensional array of 10 items,
while the value of Y is a one-dimensional array of 100
items. The programmer wishes to cause the value of X to be
the 100-item array, and the value of Y to be the 10-item
array. Before performing this swap he wants to be sure that
X and Y are both one-dimensional arrays. This process may be
performed with the side-effect procedure SWAP() which has
three arguments: the names of the two variables whose values
are arrays, and the number of dimensions these arrays are
both to have. Each name is presented as a string which will
be passed to the procedure body to be used as the operand of
the indirect referencing operator to return a variable; the
number of dimensions may be expressed as any numeric-valued
expression. The SWAP() procedure uses the REPEAT()
procedure, described at the beginning of Chapter 6, to build
a pattern which can be used to determine whether or not the
prototype of each array has the specified number of
dimensions.

        DEFINE('SWAP(A,B,N)PAT1,PAT2,TEMP','PR.SWAP')
+                                              : (END.SWAP)
*
*  TEST WHETHER THE FIRST TWO ARGUMENTS ARE ARRAY-VALUED
PR.SWAP IDENT(TYPE($A),'ARRAY')                : F(SWAP.ER1)
        IDENT(TYPE($B),'ARRAY')                : F(SWAP.ER2)
*
*  TEST WHETHER BOTH ARRAYS ARE OF THE SPECIFIED DIMENSION
*  BUILD A PATTERN USING REPEAT() TO LOOK FOR THE RIGHT
*  NUMBER OF COLONS WITHIN THE PROTOTYPE
        PAT1 = BREAK(':') ':'
        PAT2 = POS(0) REPEAT(PAT1,N)
+               SPAN('-0123456789') RPOS(0)
        PROTOTYPE($A) PAT2                     : F(SWAP.ER3)
        PROTOTYPE($B) PAT2                     : F(SWAP.ER4)
*  BOTH ARE ARRAYS OF THE SPECIFIED DIMENSION
*  SWAP THEM AND RETURN
        TEMP = $A
        $A = $B
        $B = TEMP                              : (RETURN)
*
*  PRINT ERROR MESSAGES AND FAIL
SWAP.ER1 OUTPUT = ' FIRST ARGUMENT OF SWAP() NOT AN ARRAY'
+                                              : (FRETURN)
SWAP.ER2 OUTPUT = ' SECOND ARGUMENT OF SWAP() NOT AN ARRAY'
+                                              : (FRETURN)
SWAP.ER3 OUTPUT = ' FIRST ARRAY NOT OF DIMENSION ' N
+                                              : (FRETURN)
SWAP.ER4 OUTPUT = ' SECOND ARRAY NOT OF DIMENSION ' N
+                                              : (FRETURN)
END.SWAP



A call on this procedure to do the swapping of the
values of X and Y as described above could have the form

        SWAP('X','Y',1)                  : F(ERROR)

Since the formal variables A and B never appear within
the procedure body except preceded by a $ operator, it would
seem at first that the call SWAP(X,Y,1) could be used
instead of the call SWAP('X','Y',1) and all the indirect
referencing operators removed from the procedure body, since
the expression $'X' is indeed equivalent to X in all cases.
If this were done, however, the value of X would be used
wherever the formal variable A occurred in the procedure
body. While the expressions TYPE(A) and PROTOTYPE(A), where
A has as its value the same array that is the value of X,
will indeed work as desired, rules of the form A = B and
B = TEMP will not produce the desired effect. Execution of
the rule A = B would cause the formal variable A to be
assigned the array which is the value of Y, and the rule
B = TEMP would cause the formal variable B to be assigned
the array which is the value of X. Thus the values of A and
B, which are internal to the procedure call only, would be
swapped rather than the values of the external variables X
and Y. In order to change the value of X, the string which
is its name must be passed and a rule of the form $A = $B
must be used, since the expression $A, in this case, will
return the external variable X to which an assignment can
then be made.
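
The difference between passing values and passing names can be demonstrated with a Python analogy. Here a dictionary stands in for the program's external variables, so that a variable can be reached through the string which is its name; the function names swap_values and swap_names are hypothetical.

```python
def swap_values(a, b):
    """Rebinds only the formal variables; the caller sees no change."""
    a, b = b, a


def swap_names(variables, name_a, name_b):
    """Swaps the values of the variables whose names are passed."""
    variables[name_a], variables[name_b] = variables[name_b], variables[name_a]


# A dictionary stands in for the external variables X and Y.
variables = {"X": ["small"] * 10, "Y": ["large"] * 100}
swap_values(variables["X"], variables["Y"])
print(len(variables["X"]))     # X still has the 10-item array as value
swap_names(variables, "X", "Y")
print(len(variables["X"]))     # X now has the 100-item array as value
```

Looking a variable up by name in the dictionary plays the part that the $ operator plays in the body of SWAP().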

The Name Operator. Since array items do not have
strings as names, problems arise when one tries to pass the
name of an array item to a procedure. If the 100-item array
described above had been assigned to the created variable
LIST[1] instead of to the natural variable Y, and its value
was to be swapped with that of the 10-item array which is
the value of X, then a call of the form

        SWAP('X','LIST[1]',1)

would not produce the desired effect since the string
LIST[1] is the name of a natural variable, and thus cannot
be the name of a created variable.

The problem of passing the name of a created variable
is solved with the use of the name operator, a unary
operator whose symbol is a period. This operator takes any
variable as its operand and returns as its value a special
object of datatype Name which is a name for that variable.
Thus the name of the created variable LIST[1] may be
represented as .LIST[1], so a procedure call of the form

        SWAP('X',.LIST[1],1)

would produce the desired effect.

If the operand of the name operator is a natural
variable, which thus has a string name like X for example,
then the Name .X provides still a different name by which to
refer to that variable. The two names always refer to the
same variable, and can be used interchangeably. The
application of the $ operator to an operand of datatype Name
gives the same effect as its application to a string-valued
operand: the variable named by the operand is returned. Thus
the call

        SWAP(.X,.LIST[1],1)

could be used as well. The only necessity for the use of the
name operator arises when names of created variables must be
passed to and from procedures. Note that objects of datatype
Name cannot be printed.

As an example of an application in which a Name is to
be returned by a procedure, consider an amended version of
the SELECT() procedure, presented earlier in this chapter,
which would return the Name of the first null-valued item of
an array rather than its selector. This amended procedure,
called STEP(), is presented below; the entire procedure body
is the same as that of SELECT() except for the statement
labelled OUT.STEP in which the result variable is assigned a
value of datatype Name.

*  PROCEDURE TO RETURN NAME OF FIRST NULL-VALUED ITEM
*
        DEFINE('STEP(ARR1)N','PR.STEP')        : (END.STEP)
*
*  TEST WHETHER FIRST ARGUMENT HAS AN ARRAY AS ITS VALUE
PR.STEP IDENT(TYPE(ARR1),'ARRAY')              : F(STEP.ER1)
*
*  TEST WHETHER ARRAY IS 1-DIMENSIONAL AND FIND LOWER BOUND
        PROTOTYPE(ARR1) BREAK(':') . N ':'
+               SPAN('-0123456789') RPOS(0)    : F(STEP.ER2)
*  TEST WHETHER THIS ITEM HAS A NULL VALUE
*  RETURN THE NAME OF THIS ITEM IF IT DOES
OUT.STEP STEP = IDENT(ARR1[N],NULL) .ARR1[N]   : S(RETURN)
*  ELSE INCREMENT INDEX TO LOOK AT NEXT ITEM
        N = N + 1
*  TEST WHETHER THIS SELECTOR IS OUTSIDE THE BOUNDS OF ARRAY
*  IF SO, THIS ARRAY CONTAINS NO NULL-VALUED ITEMS
        ARR1[N]                        : F(FRETURN) S(OUT.STEP)
*
*  PRINT ERROR MESSAGES AND STOP
STEP.ER1 OUTPUT = ' ARGUMENT OF STEP() NOT AN ARRAY'    : (END)
STEP.ER2 OUTPUT = ' ARRAY PASSED IS NOT 1-DIMENSIONAL'  : (END)
END.STEP

The rule

        $STEP(LIST) = WORD               : F(FULL)

may be used to assign the value of WORD to the first null-
valued item of the array which is the value of LIST.
Execution will cease if the value of LIST is not a one-
dimensional array (in which case an error message is
printed). The procedure call will fail if there are no null-
valued items remaining within the array. If the procedure
call succeeds it returns the Name of the first null-valued
item; this Name is used as the operand of the $ operator
which returns the needed variable.

Alternatively, an NRETURN could be used to cause the
procedure to return a variable rather than an object of
datatype Name, but the name operator would still be needed
within the procedure body. If the statement labelled
OUT.STEP were written as

OUT.STEP STEP = IDENT(ARR1[N],NULL) .ARR1[N]   : S(NRETURN)

then the procedure call would have the form

        STEP(LIST) = WORD                : F(FULL)

since the value returned by STEP() is the variable needed
for assignment.
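
An object of datatype Name can be thought of as a handle through which one particular item may later be read or assigned. The following Python sketch models that idea loosely; the class Name and the function step are illustrative stand-ins written for this purpose, with the empty string modelling the null value and None modelling failure.

```python
class Name:
    """A handle on one item of a list, loosely modelling datatype Name."""

    def __init__(self, array, index):
        self.array, self.index = array, index

    def get(self):
        return self.array[self.index]

    def set(self, value):
        # Assigning through the handle changes the item it names.
        self.array[self.index] = value


def step(array):
    """Return a Name for the first null-valued item, or None on failure."""
    for index, value in enumerate(array):
        if value == "":
            return Name(array, index)
    return None
```

With words = ["a", "", ""], the call step(words).set("b") assigns to the second item, which is the effect obtained above by applying the $ operator to the Name that STEP() returns.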

Forming All Selectors of an Array. Whenever the STEP()
procedure is called, it always starts by investigating the
"first" item of a one-dimensional array, that is, the one
whose selector is formed by using the lower bound of the
array as its single index. The procedure continues to form
new selectors by adding one to the value of this index until
a null value is found, or until an attempt is made to
increase the index beyond the upper bound of the array; if
this happens, then every selector of the array has been
used. Since the STEP() procedure has been written to process
one-dimensional arrays only, the method it uses for
determining all selectors of an array is very simple. The
process of determining all selectors becomes more
complicated when an array is multi-dimensional.

A general purpose method which would work for an array
of any number of dimensions could be described as follows.
Start with a selector formed by using the lower bound of
each dimension as its index; this information may be
obtained from the prototype of the array. (For example, the
initial selector of an array whose prototype is
0:2,1:10,1:10 is 0,1,1.) Subsequent selectors are formed by
adding one to the index of the last (rightmost) dimension
until the upper bound for that dimension is reached (just as
for a one-dimensional array), while keeping all other
indices constant. When the upper bound of the last index is
reached, reset that index to its lower bound and increment
the index of the penultimate dimension by one. For this
value of the next-to-the-last index, run through all values
of the last index again, resetting when the upper bound is
reached. Repeat this process for all values of the
penultimate dimension, then reset this index to its
lower bound and begin incrementing the index of the
antepenultimate dimension, repeating the previously
described processes for each of its values, etc. Proceed
until the index of the first dimension has reached its upper
bound; then, all selectors of the array have been formed.

If the process just described is applied to a three-
dimensional array whose prototype is 1:3,1:2,1:2, the
following selectors will be formed in the indicated
"numeric" order.

        (1.) 1,1,1        (5.) 2,1,1        ( 9.) 3,1,1
        (2.) 1,1,2        (6.) 2,1,2        (10.) 3,1,2
        (3.) 1,2,1        (7.) 2,2,1        (11.) 3,2,1
        (4.) 1,2,2        (8.) 2,2,2        (12.) 3,2,2

It is easily seen from this display that the rightmost
index does indeed vary most often, while the leftmost index
is never reset but goes through its range of values only
once. The process could be described just as easily with the
leftmost index varying most often, but the order in which
the particular selectors are formed is immaterial since the
same process may be used whenever all items of an array are
to be considered. Thus if all items are assigned values by
the method just described and later the same method is used
to print the values, then the values will be printed in
whatever order they were assigned. Since there are many
applications in which all items of an array must be
considered, it is convenient to express this process in
terms of a procedure.
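
The general method just described behaves like an odometer, and it may help to see it written out. The following Python generator is a sketch of that method under the assumption that the prototype is given in canonical form (every lower bound explicit); the name all_selectors is chosen here for illustration.

```python
def all_selectors(prototype):
    """Yield every selector of an array, rightmost index varying fastest."""
    bounds = [tuple(int(b) for b in dim.split(":"))
              for dim in prototype.split(",")]
    current = [lower for lower, _ in bounds]      # start at the lower bounds
    while True:
        yield ",".join(str(i) for i in current)
        # Find the rightmost index that has not reached its upper bound,
        # resetting to the lower bound every index passed over on the way.
        position = len(current) - 1
        while position >= 0 and current[position] >= bounds[position][1]:
            current[position] = bounds[position][0]
            position -= 1
        if position < 0:
            return                                # every index was at its bound
        current[position] += 1
```

Applied to the prototype 1:3,1:2,1:2, the generator yields twelve selectors, beginning with 1,1,1 and ending with 3,2,2.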



Procedure to Return the "Next" Selector. Presented
below is a programmer-defined procedure, NEXT(), which
requires two strings as arguments: the first represents a
current selector and the second the prototype of the array
whose "next" selector is to be formed; this selector is
returned in the form of a string as the value of the NEXT()
procedure. Here "next" is used to mean the selector which
follows in the order described in the preceding section. The
NEXT() procedure fails when there is no next selector, for
example, when the current selector passed as its argument is
the last in the order described above.

*  PROCEDURE TO RETURN THE "NEXT" SELECTOR
*
        DEFINE('NEXT(SEL,PROTO)INDEX,LB,UB','PR.NEXT')
*
*  PATTERN FOR TEARING SELECTOR APART INTO ITS INDICES
*  ASSIGN THIS PATTERN TO THE MAIN-PROGRAM VARIABLE SEL.PAT
        SEL.PAT = (',' | NULL) SPAN('-0123456789') . INDEX
+               RPOS(0)
*
*  PATTERN FOR TEARING PROTOTYPE APART TO FIND LOWER AND
*  UPPER BOUNDS
*  ASSIGN THIS PATTERN TO THE MAIN-PROGRAM VARIABLE PROT.PAT
        PROT.PAT = (',' | NULL) SPAN('-0123456789') . LB
+               ':' SPAN('-0123456789') . UB RPOS(0)  : (END.NEXT)
*
*  FIND RIGHTMOST INDEX OF THE SELECTOR STRING AND REMOVE
*  FAIL IF NO MORE INDICES TO BE FOUND
PR.NEXT SEL SEL.PAT = NULL                     : F(FRETURN)
*
*  FIND LOWER & UPPER BOUNDS FOR THIS DIMENSION
        PROTO PROT.PAT = NULL
*
*  INCREMENT INDEX IF IT IS LESS THAN THE UPPER BOUND
        INDEX = LT(INDEX,UB) INDEX + 1         : F(RESET.NEXT)
*
*  FORM NEXT SELECTOR STRING BY CONCATENATION
        NEXT = IDENT(SEL,NULL) INDEX ',' NEXT  : S(RET.NEXT)
        NEXT = SEL ',' INDEX ',' NEXT
*
*  REMOVE SPURIOUS FINAL COMMA FROM SELECTOR STRING
RET.NEXT NEXT ',' RPOS(0) = NULL               : (RETURN)
*
*  RESET THIS INDEX TO ITS LOWER BOUND, CONCATENATE IT TO
*  THE SELECTOR STRING BEING FORMED AND PROCEED TO WORK
*  ON THE NEXT INDEX
RESET.NEXT
        NEXT = LB ',' NEXT                     : (PR.NEXT)
END.NEXT



Note that the NEXT() procedure returns a string as its
value. Thus the selector represented by that string cannot
be used within an item reference, where only a selector list
is appropriate, but may be used as the second argument of
the ITEM() procedure, as in the rule

        OUTPUT = ITEM(LIST,NEXT(SELECT,PROTOTYPE(LIST)))

where the value of SELECT is a string representing the last-
used selector. If the ITEM() procedure were not defined to
accept a string as its second argument, it would not be
possible to write a useful, general purpose NEXT() procedure
to work on an array with any number of dimensions.

NEXT() was devised for the purpose of returning all
successive selectors of an array, each call to NEXT()
returning the next selector until a failure transfer is
executed. The loop shown below uses the NEXT() procedure in
this way. The INIT() procedure which precedes the loop
provides a string to be used as the initial value of SELECT;
INIT() takes a prototype as its argument and returns the
"first" selector of an array described by that prototype.

        DEFINE('INIT(PROTO)LBPAT,LB','PR.INIT')
*
*  SET UP PATTERN TO FIND LOWER BOUND FOR EACH DIMENSION
*  ASSIGN THIS PATTERN TO THE MAIN-PROGRAM VARIABLE LB.PAT
        LB.PAT = BREAK(':') . LB ':' (BREAK(',') ',' | REM)
+                                              : (END.INIT)
*
*  USE THIS PATTERN TO FIND NEXT LOWER BOUND
PR.INIT PROTO LB.PAT = NULL                    : F(RET.INIT)
*
*  FORM INITIAL SELECTOR STRING BY CONCATENATION
        INIT = INIT ',' LB                     : (PR.INIT)
*  REMOVE SPURIOUS INITIAL COMMA AND RETURN
RET.INIT INIT ',' = NULL                       : (RETURN)
END.INIT
*  LOOP TO PRINT ALL SELECTORS OF LIST
        SELECT = INIT(PROTOTYPE(LIST))
LOOP    OUTPUT = ITEM(LIST,SELECT)
        SELECT = NEXT(SELECT,PROTOTYPE(LIST))
+                                              : S(LOOP)

Since NEXT() is meant to be used in this and similar
ways, it has no special provision for dealing with selector
strings passed as the first argument which fall outside the
range of the array; such provisions could be added to make
the procedure more generally useful.
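
The string-in, string-out interface of INIT() and NEXT() can be mirrored in Python to make the loop structure easy to experiment with. The functions init_selector and next_selector below are hypothetical counterparts written for this illustration; they assume a canonical prototype and, like NEXT(), make no check that the selector passed in lies within the array. The value None stands in for failure of the procedure call.

```python
def init_selector(prototype):
    """Return the first selector: the lower bound of every dimension."""
    return ",".join(dim.split(":")[0] for dim in prototype.split(","))


def next_selector(selector, prototype):
    """Return the selector after `selector`, or None if it is the last."""
    indices = [int(i) for i in selector.split(",")]
    bounds = [[int(b) for b in dim.split(":")] for dim in prototype.split(",")]
    # Work from the rightmost index leftward.
    for position in reversed(range(len(indices))):
        lower, upper = bounds[position]
        if indices[position] < upper:
            indices[position] += 1
            return ",".join(str(i) for i in indices)
        indices[position] = lower        # reset and move on to the next index
    return None                          # no index could be incremented
```

A loop over an array then starts from init_selector(prototype) and calls next_selector() repeatedly until None is returned.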



Procedure to Return a Copy of an Array. It is often
necessary to make a copy of an array, rather than merely
assigning the same array as the value of more than one
variable, so that changes in the values of the copy can be
made without affecting the original. To make a copy of an
array means to create a new array with the same prototype as
that of the original, and to assign to each of its items the
same value as that of the corresponding item in the original
array. The following COPY() procedure returns as its value a
copy of any array; it requires only one argument, which may
be any expression whose value is the array to be copied;
this array may have any number of dimensions. The COPY()
procedure invokes the INIT() procedure to form the initial
selector string, and the NEXT() procedure to insure that all
items are considered and hence copied; both of these
procedures are described in the preceding section. A call to
the COPY() procedure fails, causing an error message to be
printed, only if its argument is not of datatype Array.

*  PROCEDURE TO RETURN A COPY OF ANY ARRAY
*
        DEFINE('COPY(ARR1)SELECT,P','PR.COPY') : (END.COPY)
*  TEST WHETHER ARGUMENT IS AN ARRAY
PR.COPY IDENT(TYPE(ARR1),'ARRAY')              : F(COPY.ER1)
*
*  CREATE A NEW ARRAY WITH PROTOTYPE OF ARGUMENT
*  AND ASSIGN IT AS THE VALUE OF THE RESULT VARIABLE
        P = PROTOTYPE(ARR1)
        COPY = ARRAY(P)
*
*  CALL INIT() TO RETURN THE FIRST SELECTOR OF THIS ARRAY
        SELECT = INIT(P)
*
*  COPY VALUE OF NEXT ITEM OF ARRAY, USING ITEM()
COPY.COPY
+       ITEM(COPY,SELECT) = ITEM(ARR1,SELECT)
*
*  CALL NEXT() TO RETURN THE NEXT SELECTOR OF THIS ARRAY
*  IF NO NEXT SELECTOR, RETURN
        SELECT = NEXT(SELECT,P)                : S(COPY.COPY)
+                                                F(RETURN)
COPY.ER1 OUTPUT = ' ARGUMENT OF COPY NOT AN ARRAY'
+                                              : (FRETURN)
END.COPY
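
The practical difference between a copy and a second name for the same array shows up as soon as one of them is changed. In the following Python analogy a dictionary keyed by selector strings stands in for a two-dimensional array; the function copy_array and the variable names are illustrative only.

```python
def copy_array(original):
    """Return a new dict holding the same values under the same selectors."""
    duplicate = {}
    for selector, value in original.items():
        duplicate[selector] = value      # copied item by item
    return duplicate


board = {"1,1": "X", "1,2": "", "2,1": "", "2,2": "O"}
alias = board                  # the same array under a second name
clone = copy_array(board)      # a separate array with equal values
clone["1,2"] = "X"             # changes the copy only
alias["2,1"] = "O"             # changes the original as well
print(board["1,2"] == "", board["2,1"] == "O")
```

Both comparisons printed by the last line are true: the change made through the copy did not reach the original, while the change made through the second name did.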






Appendix A. SUMMARY OF PREDEFINED PROCEDURES



I. PROGRAM PROCEDURES are used by the programmer as basic
operations in constructing programs.

   A. Test Procedures

      1. General Comparison
            IDENT()
            DIFFER()

      2. String Comparison
            LGT()

      3. Arithmetic Comparison
            EQ()
            NE()
            GT()
            GE()
            LT()
            LE()

   B. Result Procedures

      1. Pattern Construction
            ANY()
            NOTANY()
            SPAN()
            BREAK()
            LEN()
            TAB()
            RTAB()
            POS()
            RPOS()
            ARBNO()

      2. String Operation
            TRIM()

   C. Data Procedures

      1. Structure Creation
            ARRAY()

      2. Field Selection
            PARAM()
            FIRST()
            REST()
            LEFT()
            RIGHT()
            FAMILY()
            SELECTOR()



II. SYSTEM PROCEDURES are used to communicate instructions
and requests to the Snobol system.

   A. Declarations

      1. Programmer-defined Procedures
            DEFINE()

      2. Programmer-defined Datatypes
            DATA()

   B. Access to System Information

      1. Attributes of Objects
            SIZE()
            DATATYPE()
            TYPE()
            PROTOTYPE()

      2. Execution Information
            ALPHABET()
            DATE()
            CLOCK()
            TIME()
            STCOUNT()
            STLIMIT()
            MAXLNGTH()
            FNCLEVEL()
            NEXTVAR()

   C. Requests for System Actions

      1. Special Execution
            APPLY()
            IF()

      2. Set Mode of Pattern-Matching
            ANCHOR()

      3. Datatype Conversion
            CONVERT()
            CODE()

   D. Input/Output Procedures

      1. File Association
            INPUT()
            OUTPUT()
            DETACH()

      2. Requests for File Actions
            ENDGROUP()
            REWIND()
            REMARK()
            FREEZE()

      3. Tests of File Position
            EORLEVEL()
            EOI()






The foregoing classification scheme is introduced as an
aid to understanding the purpose and use of the various
predefined procedures; the particular classes differentiated
play no part in the definition of Snobol, and other
classifications could be devised. Notice that most
programmer-defined procedures declared by DEFINE()
constitute extensions of the classes of test procedures and
result procedures, and that those declared by DATA()
constitute extensions of the classes of structure creation
and field selection procedures.

In the descriptions which follow, each predefined
procedure is shown along with the kind of value required for
its argument(s) and the kind of value it returns. There are
no syntactic restrictions on the form of arguments; since
all arguments are passed "by value" in Snobol procedure
calls, actual arguments may be written as
arbitrarily-complicated expressions. There are, however,
semantic restrictions on the values resulting from
evaluation of actual arguments, defined in terms of
"datatypes." Every data object known to a Snobol program is
of datatype String, Integer, Pattern, Real, Array, Name,
Code, or a programmer-defined datatype. Each procedure is
shown here with the datatypes it will accept; a call of a
procedure using an argument with a wrong datatype will
result in an execution-time error. Some procedures are
described as accepting the non-datatype "structure"; these
procedures will accept an argument of any programmer-defined
datatype. Some procedures are described as accepting the
non-datatype "any"; these procedures impose no restrictions
on their arguments. Some procedures are described with an
empty argument list; these procedures are defined to have no
arguments.

There are two generalizations not specifically
mentioned in the descriptions: (1) a procedure which accepts
a Pattern will accept a String or an Integer; (2) a
procedure which accepts a String will accept an Integer.

Any predefined procedure may be called with more or
fewer arguments than are shown in its definition. Missing
arguments are assumed to be the null value; extra arguments
are evaluated but otherwise ignored. The evaluation of extra
arguments may have important consequences, however; if the
evaluation involves the invocation of procedures which
produce side effects, for example, it will cause those side
effects to occur before the outer procedure call occurs, and
failure during any part of the evaluation of the arguments
will result in failure of the rule before the procedure call
occurs. The extra arguments are ignored only in the sense
that they are not passed to the procedure being called.
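
The argument-count convention just described can be modelled outside Snobol. The following Python sketch is only an illustration under stated assumptions: the names call_predefined, Failure, noisy, and size are hypothetical, and the null value is represented by the empty string. It shows that every actual argument is evaluated before the call, that extras are then dropped, and that missing arguments become null.

```python
class Failure(Exception):
    """Models failure of a Snobol rule (an assumption of this sketch)."""


NULL = ""  # the null value is modelled here as the empty string


def call_predefined(proc, arity, arg_thunks):
    """Call proc with exactly `arity` arguments.

    Every actual argument is evaluated first, left to right, so side
    effects and failures happen before the call. Extra values are then
    dropped and missing ones are filled with the null value.
    """
    values = [thunk() for thunk in arg_thunks]   # any thunk may raise Failure
    values = values[:arity]                      # extras are not passed on
    values += [NULL] * (arity - len(values))     # missing arguments become null
    return proc(*values)


log = []


def noisy(value):
    """Return a thunk that records its own evaluation, then yields value."""
    def thunk():
        log.append(value)
        return value
    return thunk


def size(text):
    """Hypothetical one-argument procedure: length of its argument."""
    return len(str(text))


result_extra = call_predefined(size, 1, [noisy("abc"), noisy("ignored")])
result_missing = call_predefined(size, 1, [])
```

In this model the second argument is still evaluated (it appears in log) even though size never receives it.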






I. PROGRAM PROCEDURES

I.A Test Procedures

IDENT(any,any)            Returns: null value, or fails
DIFFER(any,any)           Returns: null value, or fails

IDENT() and DIFFER() are used to compare two arguments
of any datatype to see if they are indistinguishable to the
Snobol system: equivalent pattern structures, the same
array, equal integers, identical character strings, or
whatever. IDENT() succeeds if its arguments are identical;
DIFFER() succeeds if its arguments are not identical.

     IDENT(PRU.PAT,TEST.PAT)

     DIFFER(WORD,NULL)



LGT(String,String)        Returns: null value, or fails

LGT() (a mnemonic for Lexicographically Greater Than)
compares two strings to see if they are "alphabetically"
ordered, using as an alphabet the computer's character set
in its standard collating sequence. (Notice that the
arguments must be given in the reverse of the desired order;
the test is whether the first argument follows the second
argument.)

     LGT(WORD,'LEMUEL')

     LGT(WORD,TEST)
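
As an illustration of comparing by a collating sequence rather than by any built-in character order, here is a minimal Python sketch. The function lgt and the alphabet ALPHABET_GUESS are hypothetical stand-ins: the real collating sequence is the one defined in Appendix I, which is not reproduced here, and the sketch returns True or False instead of succeeding or failing.

```python
# Hypothetical collating sequence: letters, then digits, then blank.
ALPHABET_GUESS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 "


def lgt(first, second, alphabet=ALPHABET_GUESS):
    """Return True when `first` follows `second` in the given alphabet."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(text):
        # Compare strings as sequences of positions in the alphabet.
        return [rank[ch] for ch in text]

    return key(first) > key(second)


follows = lgt("MOSES", "LEMUEL")      # M comes after L
same = lgt("LEMUEL", "LEMUEL")        # a string does not follow itself
digits_last = lgt("A1", "AB")         # digits rank after letters here
```

The last call shows why the alphabet matters: under this assumed sequence 'A1' follows 'AB', which is the opposite of the ASCII ordering.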



EQ(Integer,Integer)       Returns: null value, or fails
EQ(Real,Real)             Returns: null value, or fails
NE(Integer,Integer)       Returns: null value, or fails
NE(Real,Real)             Returns: null value, or fails
GT(Integer,Integer)       Returns: null value, or fails
GT(Real,Real)             Returns: null value, or fails
GE(Integer,Integer)       Returns: null value, or fails
GE(Real,Real)             Returns: null value, or fails
LT(Integer,Integer)       Returns: null value, or fails
LT(Real,Real)             Returns: null value, or fails
LE(Integer,Integer)       Returns: null value, or fails
LE(Real,Real)             Returns: null value, or fails

These arithmetic test procedures are used to compare
the first argument to the second argument to see if the
relationship symbolized by the procedure name is true. The
two arguments must be of the same datatype.

     EQ(ACNT,BCNT) ; LT(LINE,5)

     X = LE(X,8) X + 1                           :F(OUT)
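
The last example combines a test procedure with an assignment: when LE() succeeds it contributes only the null value, so the assignment stores X + 1, and when it fails the whole rule fails and control goes to OUT. A rough Python model of that loop follows; the names Failure, le, and count_up are hypothetical, and an exception stands in for failure of the rule.

```python
class Failure(Exception):
    """Stands in for failure of a rule (an assumption of this sketch)."""


NULL = ""


def le(a, b):
    """Model of LE(): return the null value on success, otherwise fail."""
    if a <= b:
        return NULL
    raise Failure


def count_up(x):
    """Repeat 'X = LE(X,8) X + 1' until the rule fails, then return X."""
    while True:
        try:
            le(x, 8)       # succeeds with null, or raises Failure
            x = x + 1      # the assignment happens only on success
        except Failure:
            return x       # corresponds to taking the :F(OUT) branch


final_from_one = count_up(1)
final_from_twenty = count_up(20)
```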



I.B Result Procedures

ANY(String)               Returns: Pattern

ANY() returns a pattern which will match any single
character from its argument string.

     ANY('AEIOU') ; ANY(VOWELS)

NOTANY(String)            Returns: Pattern

NOTANY() returns a pattern which will match any single
character not appearing in its argument string.

     NOTANY('AEIOU') ; NOTANY(VOWELS)

SPAN(String)              Returns: Pattern

SPAN() returns a pattern which will match the longest
continuous string of one or more characters appearing in its
argument string.

     SPAN('AEIOU') ; SPAN(VOWELS) ; SPAN('MISSISSIPPI')

BREAK(String)             Returns: Pattern

BREAK() returns a pattern which will match the longest
continuous string of none or more characters not appearing
in its argument string; that is, everything up to but not
including any character in its argument.

     BREAK('AEIOU') ; BREAK(VOWELS) ; BREAK('MISSISSIPPI')
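
The four character-class procedures above can be pictured as functions that take a subject string and a cursor and return the cursor after the match. The Python sketch below is a simplified model, not the Snobol implementation: the names any_of, notany, span, and break_ are hypothetical, None stands for failure, and the sketch assumes (the text above does not say) that BREAK() fails when no break character remains in the subject.

```python
def any_of(chars):
    """Match exactly one character that appears in chars."""
    def match(subject, cursor):
        if cursor < len(subject) and subject[cursor] in chars:
            return cursor + 1
        return None
    return match


def notany(chars):
    """Match exactly one character that does not appear in chars."""
    def match(subject, cursor):
        if cursor < len(subject) and subject[cursor] not in chars:
            return cursor + 1
        return None
    return match


def span(chars):
    """Match the longest run of one or more characters from chars."""
    def match(subject, cursor):
        end = cursor
        while end < len(subject) and subject[end] in chars:
            end += 1
        return end if end > cursor else None
    return match


def break_(chars):
    """Match none or more characters up to, not including, one from chars."""
    def match(subject, cursor):
        end = cursor
        while end < len(subject) and subject[end] not in chars:
            end += 1
        # Assumption: with no break character left, the match fails.
        return end if end < len(subject) else None
    return match


VOWELS = "AEIOU"
```

For example, break_(VOWELS) applied to 'STRING' at cursor 0 stops before the I, having matched 'STR'.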



LEN(Integer)              Returns: Pattern

LEN() returns a pattern which will match any string of
characters of the length given by its argument.

     LEN(5) ; LEN('22') ; LEN(SIZE(VOWELS))

TAB(Integer)              Returns: Pattern

TAB() returns a pattern which will match all the
characters up to the string position specified by its
argument. (The convention for string numbering is that
string position 0 precedes the first character, string
position 1 is after the first character, and string position
n is after the n-th character.)

     TAB(5) ; TAB('22') ; TAB(COUNT)

RTAB(Integer)             Returns: Pattern

RTAB() returns a pattern which will match all the
characters up to the string position specified by its
argument. Its action is identical to TAB(), matching strings
of characters from left to right; the only difference
between them is the numbering convention used by the
argument. (RTAB()'s numbering convention is that string
position 0 is after the last character, string position 1 is
before the last character, and string position n is before
the n-th character from the end of the string.)

     RTAB(5) ; RTAB('22') ; RTAB(0)

POS(Integer)              Returns: Pattern

POS() returns a pattern which will match only the
string position specified by its argument; it matches no
characters at all. (String positions follow the numbering
convention of TAB().)

     POS(0) ; POS(5) ; POS('22')

RPOS(Integer)             Returns: Pattern

RPOS() returns a pattern which will match only the
string position specified by its argument; it matches no
characters at all. (String positions follow the numbering
convention of RTAB().)

     RPOS(5) ; RPOS('22') ; RPOS(COUNT)
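
The two numbering conventions are easy to confuse, so a small Python model may help. It is a sketch under assumptions: the names tab, rtab, pos, and rpos are hypothetical, a match is represented by the new cursor position, and None stands for failure. Positions counted from the left are used directly; positions counted from the right are converted by subtracting from the length of the subject.

```python
def tab(n):
    """Match everything from the cursor up to left-numbered position n."""
    def match(subject, cursor):
        return n if cursor <= n <= len(subject) else None
    return match


def rtab(n):
    """Like tab(), but n is counted back from the end of the subject."""
    def match(subject, cursor):
        target = len(subject) - n
        return target if cursor <= target <= len(subject) else None
    return match


def pos(n):
    """Match no characters; succeed only at left-numbered position n."""
    def match(subject, cursor):
        return cursor if cursor == n else None
    return match


def rpos(n):
    """Match no characters; succeed only at right-numbered position n."""
    def match(subject, cursor):
        return cursor if cursor == len(subject) - n else None
    return match


SUBJECT = "SNOBOL"
left_two = SUBJECT[0:tab(2)(SUBJECT, 0)]       # characters before position 2
all_but_two = SUBJECT[0:rtab(2)(SUBJECT, 0)]   # everything except the last two
```

With the six-character subject 'SNOBOL', tab(2) and rtab(4) name the same position, as do pos(6) and rpos(0).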



ARBNO(Pattern)            Returns: Pattern

ARBNO() returns a pattern which will match zero or more
occurrences of the pattern which is its argument.

     ARBNO(BREAK(' .,;') LEN(1)) ; ARBNO(ANY('AEIOU'))

TRIM(String)              Returns: String

TRIM() returns a string which is the same as its
argument, but shorn of trailing blanks.

     TRIM(WORD) ; TRIM(INPUT) ; TRIM(UNCLE.TOBY)



I.C Data Procedures

ARRAY(String)             Returns: Array

ARRAY() accepts as its single argument a prototype
string specifying the number of dimensions wanted and the
upper and lower bounds for the index of each dimension.
ARRAY('10,15') specifies a two-dimensional array with
indices from one to ten and one to fifteen.
ARRAY('0:60,-5:+5') specifies a two-dimensional array with
indices from zero to sixty and from minus five to plus five
(i.e., a sixty-one by eleven item array). All array items
are initialized to the null value. There is no limit on the
number of dimensions which may be specified for an array.

Since ARRAY() returns an object of datatype Array as
its value, it is used by writing something like

     LIST = ARRAY('0:60')

which has the effect of creating a family of sixty-one
variables, which may then be referred to by the item
references LIST[0], LIST[1],...,LIST[60].
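
To make the prototype notation concrete, the following Python sketch parses a prototype string into (lower, upper) bounds and counts the items it describes. The function names parse_prototype and item_count are hypothetical, and the sketch assumes only what the text states: dimensions are separated by commas, and a dimension written without a colon has a lower bound of one.

```python
def parse_prototype(prototype):
    """Turn a prototype such as '0:60,-5:+5' into a list of (low, high)."""
    bounds = []
    for dimension in prototype.split(","):
        low, colon, high = dimension.strip().partition(":")
        if colon:
            bounds.append((int(low), int(high)))
        else:
            bounds.append((1, int(low)))   # lower bound defaults to one
    return bounds


def item_count(prototype):
    """Number of array items implied by the prototype."""
    total = 1
    for low, high in parse_prototype(prototype):
        total *= high - low + 1
    return total


two_dim = parse_prototype("10,15")
grid_items = item_count("0:60,-5:+5")    # sixty-one by eleven
list_items = item_count("0:60")
```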



PARAM(Pattern)            Returns: Pattern, String, or Integer

PARAM() accepts as its argument only a pattern returned
by one of the ten predefined pattern procedures; it returns
the argument (parameter) with which one of those was called
to construct the pattern. If the pattern is one constructed
by LEN(), POS(), RPOS(), TAB(), or RTAB(), then PARAM()
returns an integer; if the pattern was constructed by ANY(),
NOTANY(), SPAN(), or BREAK(), then PARAM() returns a string
of characters in their standard collating sequence (the
sequence defined by ALPHABET()). If the pattern was
constructed by ARBNO(), then PARAM() returns the pattern
that was its argument, which may of course be of datatype
String or Integer in simple cases.



FIRST(Pattern)            Returns: Pattern

FIRST() accepts as an argument a pattern constructed by
an alternation or concatenation operator. It returns the
first element of the pattern. Thus if

     PAT = X Y | Z

has been executed, then

     FIRST(PAT)

returns the pattern which is the value of the expression
X Y, a concatenation. On the other hand, if

     PAT = X (Y | Z)

has been executed, then

     FIRST(PAT)

returns the pattern which is the value of X.

REST(Pattern)             Returns: Pattern

REST() is the complement to FIRST(); it also accepts
alternated or concatenated patterns as arguments, and
returns all but the first element. Thus, if

     PAT = X Y | Z

has been executed, then

     REST(PAT)

returns the pattern which is the value of Z. If, however,

     PAT = X (Y | Z)

has been executed, then

     REST(PAT)

returns the pattern which is the value of Y | Z, an
alternation.
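
The behaviour of FIRST() and REST() follows from how the two example patterns are grouped: concatenation binds more tightly than alternation, so X Y | Z is an alternation whose first element is the concatenation X Y. The Python sketch below models that grouping with two small node classes. The names Alt, Cat, first, and rest are hypothetical, and plain strings stand in for the component patterns X, Y, and Z.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alt:
    """An alternation: first | rest."""
    first: object
    rest: object


@dataclass(frozen=True)
class Cat:
    """A concatenation: first followed by rest."""
    first: object
    rest: object


def first(pattern):
    """Model of FIRST(): the first element of an Alt or Cat node."""
    if isinstance(pattern, (Alt, Cat)):
        return pattern.first
    raise TypeError("needs an alternated or concatenated pattern")


def rest(pattern):
    """Model of REST(): everything but the first element."""
    if isinstance(pattern, (Alt, Cat)):
        return pattern.rest
    raise TypeError("needs an alternated or concatenated pattern")


pat1 = Alt(Cat("X", "Y"), "Z")    # PAT = X Y | Z
pat2 = Cat("X", Alt("Y", "Z"))    # PAT = X (Y | Z)
```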

LEFT(Pattern)             Returns: Pattern

LEFT() accepts as an argument a Pattern constructed by
an immediate assignment or conditional assignment operator;
it returns the pattern which is the left-hand operand of
that operator. Thus if

     PAT = ANY(VOWELS) . V

has been executed, then

     LEFT(PAT)

returns the pattern which is the value of the expression
ANY(VOWELS).

RIGHT(Pattern)            Returns: Name
RIGHT(Name)               Returns: String

RIGHT() may have as its argument a pattern constructed
by an assignment operator, in which case it is the
complement to LEFT(). For instance, if

     PAT = ANY(VOWELS) $ V

has been executed, then

     RIGHT(PAT)

returns the value of the expression .V, the Name of the
variable V.

RIGHT() may also have as argument a deferred evaluation
pattern, in which case it returns the Name of the operand of
the deferred evaluation operator. If

     PAT = *V

has been executed, then

     RIGHT(PAT)

returns the value of the expression .V, the Name of the
variable V.

Finally, RIGHT() may have as its argument the Name
(datatype Name) of a natural variable, in which case it
returns the String which is the other name of that variable.
(RIGHT() will not accept the Name of a created variable, nor
the String name of a natural variable.) Thus, the value of
RIGHT(.V) is the String V; the statements

     PAT = ANY(VOWELS) $ V
     OUTPUT = RIGHT(RIGHT(PAT))

will print the character V. Since objects of datatype Name
cannot be printed, it is the RIGHT() procedure which
converts Names of natural variables into a form suitable for
assignment to OUTPUT. (To print Names of created variables,
see FAMILY() and SELECTOR() below.)

FAMILY(Name)              Returns: Array or structure

FAMILY() accepts as argument the Name of a created
variable (array item, or field of a programmer-defined data
structure). It returns the object which is the family of
variables to which the Named variable belongs. If LIST has
been assigned an array as value as in

     LIST = ARRAY('0:10')

and the rule

     ELEMENT = .LIST[5]

has been executed (notice that the value of ELEMENT is of
datatype Name), then

     FAMILY(ELEMENT)

returns the Array which is the value of LIST. Similarly,
after the statements

     DATA('NODE(LLINK,RLINK,INFO)')
     NEXT = NODE(,,15)
     ELEMENT = .INFO(NEXT)

have been executed, then

     FAMILY(ELEMENT)

returns the object of datatype Node which is the value of
NEXT.

Since FAMILY() returns the Array or structure rather
than the Name of the variable whose value is the Array or
structure, the value of FAMILY() is suitable for use as the
first argument of ITEM(), or a second argument of APPLY().

SELECTOR(Name)            Returns: String

SELECTOR() is the other half of FAMILY(). It also
accepts as its argument the Name of a created variable, and
returns a String which may be used to select that variable
in its family. For Arrays, SELECTOR() returns a string which
is a list of indices; for structures, SELECTOR() returns a
string naming a field selection procedure. The String
returned by SELECTOR() is appropriate for use as the first
argument of APPLY(), or a second argument of ITEM(). (Note
that this last use takes advantage of the fact that ITEM()
will accept such a String of indices; only in the case of
one-dimensional Arrays may the value of a call to SELECTOR()
be used within square brackets in an item reference.)






II. SYSTEM PROCEDURES

II.A Declarations

DEFINE(String,String)     Returns: null value

The first argument of DEFINE() is a string consisting
of the name of the procedure being defined, followed by a
pair of parentheses containing the names of the formal
variables (if any), which in turn are followed (without a
comma) by the names of internal variables (if any). The
second argument is a string naming the "entry label" for the
procedure; if the second argument is null, the entry label
is assumed to have the same form as the name of the
procedure being defined.

     DEFINE('PRINT(N,NAME)M,W,F')
     DEFINE('RECORDS','PR.RECORDS')
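
The layout of the first argument can be illustrated by taking it apart. The Python sketch below is an illustration only: the function parse_define and the dictionary keys are hypothetical, and, following the second example above, it assumes the parentheses may be omitted when there are no formal variables.

```python
import re


def split_names(text):
    """Split a comma-separated list of names, ignoring blanks."""
    return [name.strip() for name in text.split(",") if name.strip()]


def parse_define(prototype, entry_label=""):
    """Split a DEFINE() prototype into its parts.

    The entry label defaults to the procedure name when the second
    argument is null (modelled here as the empty string).
    """
    match = re.fullmatch(
        r"\s*([^(),\s]+)\s*(?:\(([^()]*)\))?\s*(.*)", prototype
    )
    if match is None:
        raise ValueError("not a valid prototype: " + repr(prototype))
    name, formals, internals = match.groups()
    return {
        "name": name,
        "formals": split_names(formals or ""),
        "internals": split_names(internals),
        "entry": entry_label or name,
    }


print_proc = parse_define("PRINT(N,NAME)M,W,F")
records_proc = parse_define("RECORDS", "PR.RECORDS")
```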

DATA(String)              Returns: null value

The DATA() declaration has as its argument a prototype
string consisting of the name of the datatype being defined,
followed by a parenthesized list of the names of the fields
which an object of that datatype is to comprise (if any).
The effect of the DATA() declaration is to define (without
any DEFINE()'s) a structure creation procedure for the
datatype, along with a field selection procedure for each
field. Thus, after the declaration

     DATA('NODE(LLINK,RLINK,INFO)')

has been executed, Nodes may be created with statements of
the form

     NEXT = NODE() ; CURRENT = NODE(NEXT,,TRIM(INPUT))

Fields of the created structure have values initialized
according to the values of the corresponding arguments of
the procedure call; null arguments produce null fields.

The variables which are fields of structures are
referred to by field references, consisting of a reference
to a field selection procedure with an argument of the
proper datatype to specify the family; for the example
above, by statements of the form

     LEFT = LLINK(CURRENT)
     NAME = INFO(NEXT)
     RLINK(CURRENT) = NEXT

The same field name may be used in definitions of more than
one datatype, since its interpretation is governed by the
datatype of the argument in any field reference. Notice,
however, that the names of structure creation procedures and
field selection procedures are drawn from the same set as
all other procedure names, so that (for instance) defining a
structure

     DATA('ENTRY(TYPE,SIZE,INFO)')

will re-define the predefined procedures TYPE() and SIZE()
as field selection procedures for objects of datatype Entry.
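
The effect of a DATA() declaration, including the way it can displace an existing procedure of the same name, can be modelled with a single table of procedures. This Python sketch is a simplified model under assumptions: the names procedures, Structure, and data are hypothetical, Python's len stands in for a predefined SIZE(), and only reading a field is modelled, not assigning to one.

```python
NULL = ""
procedures = {"SIZE": len}   # one shared table; len stands in for SIZE()


class Structure:
    """An object of a programmer-defined datatype."""
    def __init__(self, datatype, fields):
        self.datatype = datatype
        self.fields = fields


def data(prototype):
    """Define a creation procedure and one selector per field."""
    name, _, tail = prototype.partition("(")
    name = name.strip()
    field_names = [f.strip() for f in tail.rstrip(") ").split(",") if f.strip()]

    def create(*args):
        values = list(args[:len(field_names)])
        values += [NULL] * (len(field_names) - len(values))  # null fields
        return Structure(name, dict(zip(field_names, values)))

    procedures[name] = create
    for field in field_names:
        # Each selector reads the field from whatever structure it is given.
        procedures[field] = lambda obj, field=field: obj.fields[field]


data("NODE(LLINK,RLINK,INFO)")
nxt = procedures["NODE"]()
current = procedures["NODE"](nxt, NULL, "SOME TEXT")

data("ENTRY(TYPE,SIZE,INFO)")   # replaces the earlier SIZE entry
```

After the second declaration the table entry for SIZE is a field selector, which mirrors the warning above about re-defining predefined procedures.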



II.B Access to System Information

SIZE(String)              Returns: Integer

SIZE() returns the integer length (the number of
characters) of the string which is its argument.

     SIZE(VOWELS) ; SIZE(TRIM(INPUT))

DATATYPE(any)             Returns: String

DATATYPE() returns the string of characters which is
the name of the datatype of its argument (predefined or
programmer-defined). It is used for controlling branching,
and can be used with IDENT() to simulate other test
procedures. To test whether COUNT is an integer, write
IDENT(DATATYPE(COUNT),'INTEGER').

     DATATYPE(COUNT) ; :($('L' DATATYPE(VAL)))

TYPE(any)                 Returns: String

TYPE() returns the same result as DATATYPE() for
objects of predefined datatypes, and the string DATA for
objects of programmer-defined datatypes. Thus, an exhaustive
listing of the strings returned by TYPE() is:

     STRING    INTEGER   REAL      PATTERN
     ARRAY     NAME      CODE      DATA






PROTOTYPE(Array)          Returns: String
PROTOTYPE(structure)      Returns: String
PROTOTYPE(Pattern)        Returns: String
PROTOTYPE(Name)           Returns: String

PROTOTYPE() returns as its value a String representing
the system definition of the object which is the value of
its argument. Its operation is rather different according to
the datatype of its argument. In each case, the string
returned is intended to be convenient for investigation by
Snobol pattern-matching.

When the argument of PROTOTYPE() is an object created
by a call to the predefined structure creation procedure
ARRAY(), the string returned is the list of upper and lower
bounds of indices for the dimensions: essentially the same
as the argument given to the ARRAY() procedure, except that
lower bounds are always explicitly present, and each integer
is in canonical form (no signs for positive numbers, no
leading zeroes). Thus, if the rule

     LIST = ARRAY('00:5,-1:+3,05')

has been executed, then

     PROTOTYPE(LIST)

will return the 12-character string 0:5,-1:3,1:5.

When the argument of PROTOTYPE() is an object of a
programmer-defined datatype (one created by a call to a
programmer-defined structure creation procedure), then the
string returned is that defining the datatype of the object.
This is the same as the string which was the argument of the
call to the DATA() procedure which declared the datatype,
not the argument list of the structure creation procedure
which created the object (unlike the case for Arrays). Thus,
if the two statements

     DATA('NODE(LLINK,RLINK,INFO)')
     CURRENT = NODE(LAST,,'SONNET 15')

have been executed, the value of CURRENT is an object of
datatype Node, with its LLINK() and INFO() fields
initialized as shown and its RLINK() field null. Then the
rule

     PROTOTYPE(CURRENT)

would return the 22-character string NODE(LLINK,RLINK,INFO).
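
The canonical form described for array prototypes can be reproduced in a few lines. The Python sketch below uses a hypothetical function name, canonical_prototype, and assumes only the rules stated above: every lower bound is written out (defaulting to one), positive numbers carry no sign, and leading zeroes are dropped.

```python
def canonical_prototype(argument):
    """Rewrite an ARRAY() argument in the canonical form described above."""
    dimensions = []
    for dimension in argument.split(","):
        low, colon, high = dimension.partition(":")
        if not colon:
            low, high = "1", low          # make the lower bound explicit
        # int() drops '+' signs and leading zeroes; negative signs survive.
        dimensions.append(f"{int(low)}:{int(high)}")
    return ",".join(dimensions)


canonical = canonical_prototype("00:5,-1:+3,05")
```

Applied to the argument used in the example, this yields the same 12-character string quoted in the text.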






For both arrays and data structures, the argument of
PROTOTYPE() is an object which is a family of variables, and
the result returned is a string which can be used to
determine all the valid selectors for members of that family
(items or fields, as the case may be). (The difference is
that for arrays this information is provided in the argument
to the predefined structure creation procedure, for data
structures this information is given in the declaration of
the datatype.) In the last example, for instance, one could
obtain the values of the fields of the object named by
CURRENT by obtaining its PROTOTYPE, then searching with a
pattern between the parentheses to find the strings
delimited by commas, and using the strings located in this
way as the first argument of APPLY() with CURRENT as the
second argument.
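
The procedure just outlined (take the prototype, pick out the names between the parentheses, then apply each name to the object) can be sketched in Python as follows. Everything here is a stand-in: field_names, apply, and field_values are hypothetical names, a dictionary plays the part of the Node, and the toy apply() merely looks a field up by name.

```python
def field_names(prototype):
    """Return the comma-delimited names between the parentheses."""
    start = prototype.index("(") + 1
    stop = prototype.rindex(")")
    return [n.strip() for n in prototype[start:stop].split(",") if n.strip()]


def apply(selector, family):
    """Toy stand-in for APPLY(): look up one field of a dict-based object."""
    return family[selector]


def field_values(family, prototype):
    """Pair every field name found in the prototype with its value."""
    return [(name, apply(name, family)) for name in field_names(prototype)]


current = {"LLINK": "LAST", "RLINK": "", "INFO": "SOME TEXT"}
values = field_values(current, "NODE(LLINK,RLINK,INFO)")
```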



This idea is extended to objects of datatype Pattern
and datatype Name, by observing that although objects of
these datatypes are not families of variables, nevertheless
they may have an internal structure which a Snobol program
may wish to investigate. A Pattern may be constructed of
many parts, for instance, and a Name may indicate a family
plus a selector. For this reason, the different kinds of
Patterns and Names are provided with predefined system
prototypes, strings which contain substrings corresponding
to the names of the predefined field selection procedures
(see section I.C of this appendix). Thus, the structure of
Patterns and Names may be investigated in the same way as
that of programmer-defined data structures. The twenty-one
predefined prototypes for patterns are given in the
right-hand column of the following table.



predefined pattern variables

     P = ARB                     ; PROTOTYPE(P) -> ARB
     P = REM                     ; PROTOTYPE(P) -> REM
     P = BAL                     ; PROTOTYPE(P) -> BAL()
     P = FENCE                   ; PROTOTYPE(P) -> FENCE
     P = FAIL                    ; PROTOTYPE(P) -> FAIL()
     P = ABORT                   ; PROTOTYPE(P) -> ABORT

predefined pattern procedures

     P = LEN(6)                  ; PROTOTYPE(P) -> LEN(PARAM)
     P = POS(6)                  ; PROTOTYPE(P) -> POS(PARAM)
     P = RPOS(6)                 ; PROTOTYPE(P) -> RPOS(PARAM)
     P = TAB(6)                  ; PROTOTYPE(P) -> TAB(PARAM)
     P = RTAB(6)                 ; PROTOTYPE(P) -> RTAB(PARAM)
     P = ANY('AEIOU')            ; PROTOTYPE(P) -> ANY(PARAM)
     P = NOTANY('AEIOU')         ; PROTOTYPE(P) -> NOTANY(PARAM)
     P = SPAN('AEIOU')           ; PROTOTYPE(P) -> SPAN(PARAM)
     P = BREAK('AEIOU')          ; PROTOTYPE(P) -> BREAK(PARAM)
     P = ARBNO(ANY('AEIOU'))     ; PROTOTYPE(P) -> ARBNO(PARAM)

alternation and concatenation

     P = 'A' | 'B' | 'C'         ; PROTOTYPE(P) -> ALT(FIRST,REST)
     P = 'A' ANY('AEIOU') 'C'    ; PROTOTYPE(P) -> CAT(FIRST,REST)

assignment operators

     P = SPAN('AEIOU') . VOWELS  ; PROTOTYPE(P) -> PRD(LEFT,RIGHT)
     P = BREAK('AEIOU') $ VOWELS ; PROTOTYPE(P) -> DOL(LEFT,RIGHT)

deferred evaluation

     P = *VOWEL                  ; PROTOTYPE(P) -> STAR(RIGHT)

Similarly, a Name may be the name of a natural variable
(one that is also named by a String), or one of the two
types of created variables: an Array item, or a field of a
data structure. There is a predefined prototype for each of
these:

     VAR = .VOWELS      ; PROTOTYPE(VAR) -> INDIRECT(RIGHT)
     VAR = .LIST[I,J]   ; PROTOTYPE(VAR) -> ITEM(FAMILY,SELECTOR)
     VAR = .RLINK(NODE) ; PROTOTYPE(VAR) -> APPLY(SELECTOR,FAMILY)

Notice that the Name of a natural variable, returned by
the name operator, is a suitable argument for PROTOTYPE();
the String which names the same variable (in the example
above, VOWELS) would cause an execution-time error as an
argument of PROTOTYPE().






ALPHABET()                Returns: String

ALPHABET() returns the 63-character string which is the
Snobol character set in standard collating sequence (see
Appendix I).

     ALPHABET()

DATE()                    Returns: String

DATE() returns a nine-character string representing the
current date, in the form 02 JUL 72. The abbreviations used
for the months are the first three letters of their names.

     DATE()

CLOCK()                   Returns: String

CLOCK() returns an eight-character string representing
the time of day at which the job is being run, in the form
19:03:57. Hours are counted from zero through twenty-three,
minutes and seconds from zero through fifty-nine.

     CLOCK()

TIME()                    Returns: Integer

TIME() returns the elapsed central processor time for
the job, expressed as an integer number of milliseconds. By
subtracting the value of one call to TIME() from the value
of a later call, a programmer is able to determine the
amount of central processor time used by a particular part
of his program.

     TIME()

STCOUNT()                 Returns: Integer

STCOUNT() returns the count kept by the Snobol system
of the number of statements on which execution is begun. Its
initial value is, of course, zero when a program starts
executing.

     STCOUNT()






STLIMIT(Integer)          Returns: Integer

STLIMIT() is used to set the limit on the number of
statements executed (the value of STCOUNT()). Its initial
value is 1,000,000; lower limits may be set by the
programmer by calling STLIMIT() with a non-null integer
argument. An execution-time error results if STLIMIT() is
exceeded. If called with a null argument, STLIMIT() returns
its current value and remains unchanged.

     STLIMIT('200') ; STLIMIT(5000) ; STLIMIT()

MAXLNGTH(Integer)         Returns: Integer

MAXLNGTH() is used to set the limit on the length of
strings which may be formed, in characters. Its initial
value is 131,070; lower limits may be set by a programmer by
calling MAXLNGTH() with a non-null integer argument. An
execution-time error will result if an attempt is made to
exceed this maximum length for strings. If called with a
null argument, MAXLNGTH() returns its current value and is
unchanged.

     MAXLNGTH('200') ; MAXLNGTH(5000) ; MAXLNGTH()



FNCLEVEL()                Returns: Integer

FNCLEVEL() returns an integer value to indicate the
level of evaluation of nested or recursive procedure calls.
Its use is to provide a trace of the evaluation for
debugging of program logic, or to preserve a record of the
level of evaluation causing a failure during execution. (At
an execution-time error, this information is displayed by
the system's error message.)

     REMARK(TIME() ' ' FNCLEVEL() ' DEEP')



NEXTVAR(Name)             Returns: Name
NEXTVAR(String)           Returns: Name

NEXTVAR() accepts as its argument the Name of a created
variable, or either the Name or String naming a natural
variable.

For created variables (array items or fields of data
structures) NEXTVAR() returns the name of the "next"
member of the same family. For Arrays, names of items are
returned in the order obtained by varying the rightmost
index most rapidly. For data structures, names of fields are
returned in left to right order of their appearance in the
DATA() declaration which defined the datatype. In both
cases, the order is cyclical, the name of the "first" member
of a family (under this definition) being the value of
NEXTVAR() applied to the name of the "last" member. Thus, if
the rule

     LIST = ARRAY('0:2,0:2')

has been executed, the value of NEXTVAR(.LIST[0,0]) is the
name of the array item referred to as LIST[0,1], and the
value of NEXTVAR(.LIST[2,2]) is the name of the array item
referred to as LIST[0,0]. Similarly, if the rules

     DATA('NODE(LLINK,RLINK,INFO)')
     CURRENT = NODE()

have been executed, the value of NEXTVAR(.LLINK(CURRENT)) is
the name of the field referred to as RLINK(CURRENT), and the
value of NEXTVAR(.INFO(CURRENT)) is the name of the field
referred to as LLINK(CURRENT).

If a statement such as

     NEXT = NEXTVAR(NEXT)

is written in a loop, then the names of all the members of
the family to which the value of NEXT belongs will be
returned in order; but unless the programmer checks to see
when he is back to where he started, the loop will be
infinite. A suitable loop for going once through the fields
of a Node, then, would be

          SAVE = .LLINK(CURRENT)
          NEXT = SAVE
LOOP      [statements to process a field]
          NEXT = NEXTVAR(NEXT)
          IDENT(NEXT,SAVE)                       :F(LOOP)

NEXTVAR() is convenient for referring in turn to all
the variables of an array or a data structure, but its
effect can be programmed in Snobol using PROTOTYPE(),
ITEM(), and APPLY(). (See an example of this in Chapter 7.)
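
The cyclic ordering for array items, and the need to stop when the starting point comes round again, can be illustrated with a short Python model. The names selectors, nextvar, and visit_once are hypothetical, an array item is identified simply by its tuple of indices, and the bounds are given as (low, high) pairs.

```python
from itertools import product


def selectors(bounds):
    """All index tuples, with the rightmost index varying most rapidly."""
    ranges = [range(low, high + 1) for low, high in bounds]
    return list(product(*ranges))


def nextvar(bounds, selector):
    """Model of NEXTVAR() for array items: the next selector, cyclically."""
    order = selectors(bounds)
    position = order.index(tuple(selector))
    return order[(position + 1) % len(order)]


def visit_once(bounds, start):
    """Go once round the family, stopping when the start comes up again."""
    seen = []
    current = start
    while True:
        seen.append(current)
        current = nextvar(bounds, current)
        if current == start:      # the check that keeps the loop finite
            return seen


BOUNDS = [(0, 2), (0, 2)]         # corresponds to ARRAY('0:2,0:2')
visited = visit_once(BOUNDS, (0, 0))
```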

The more important use of NEXTVAR() arises from the
fact that it also treats the set of all natural variables as
a "family," and thus when given a String or a Name which
names a natural variable, NEXTVAR() returns the name of
another natural variable. Two important differences of
NEXTVAR() in this use should be noted. First, since there is
no defined order for the natural variables, their names are
returned in an order which is convenient for NEXTVAR().
Second, NEXTVAR() cannot cycle through the names of all the
natural variables, since there are an infinite number of
them. Hence, it returns the names of a subset of the family
of natural variables which is certain to include at least
the names of all variables with non-null values, and may
also include the names of some variables with null values.
What is important is that by the time a full cycle has been
completed and the starting place reached again, the name of
every variable with a non-null value will have come up.
(When used with families of created variables, by contrast,
NEXTVAR() is guaranteed to cycle through the names of every
variable in the family in turn, regardless of their values.)
Observe that the names returned by NEXTVAR() are subject to
the usual interpretation of names. In particular, if
NEXTVAR() is called repeatedly in a loop within the body of
a programmer-defined procedure, and some process is carried
out on the variables referenced by the names returned, then
the names of variables internal to procedure calls will
refer to those internal variables. The customary
interpretation of what variable a name refers to at any
point in the execution of a program is not affected by
NEXTVAR().



II.C Requests for System Actions

ITEM(Array,String,...,String)   Returns: variable, or fails

ITEM() provides a convenient way to write item
references for arrays chosen at execution-time, for arrays
which are the values of array items, or which involve
variable numbers of dimensions. The first argument of ITEM()
is an array, and the following arguments are either integers
or else lists of integers separated by commas. ITEM()
constructs an item reference using the array which is its
first argument for the family and the proper number of
indices gathered from the remaining arguments to form the
selector, ignoring extra indices and supplying null (zero)
for missing ones. ITEM() NRETURNs the array item so
referenced, or FRETURNs if any index of the selector exceeds
the bounds specified by the prototype for the array. If TIC3
has been assigned the value

     TIC3 = ARRAY('1:5,1:5,1:3')

then equivalent ways of referring to its central item are

     TIC3[3,3,2]
     ITEM(TIC3,3,3,2)
     ITEM(TIC3,'3,3,2')
     ITEM(TIC3,3,'3,2')
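
How ITEM() gathers its indices can be shown with a small Python model. The names Failure and item_selector are hypothetical; the sketch takes the array bounds as (low, high) pairs instead of a real Array object and returns the selector as a tuple, raising Failure where the text says ITEM() would FRETURN.

```python
class Failure(Exception):
    """Stands in for an FRETURN from ITEM() (an assumption of this sketch)."""


def item_selector(bounds, *arguments):
    """Gather indices from integers or comma-separated index strings."""
    indices = []
    for argument in arguments:
        for piece in str(argument).split(","):
            piece = piece.strip()
            indices.append(int(piece) if piece else 0)   # null counts as zero
    indices = indices[:len(bounds)]                       # ignore extra indices
    indices += [0] * (len(bounds) - len(indices))         # supply missing ones
    for index, (low, high) in zip(indices, bounds):
        if not low <= index <= high:
            raise Failure(f"index {index} outside {low}:{high}")
    return tuple(indices)


TIC3_BOUNDS = [(1, 5), (1, 5), (1, 3)]     # ARRAY('1:5,1:5,1:3')
centre = item_selector(TIC3_BOUNDS, 3, 3, 2)
```

Note that with these bounds a missing index is supplied as zero, which lies outside 1:3, so a call with too few indices fails in this model.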



APPLY(String, any, ..., any)    Returns: any or variable, or fails

APPLY() provides the only way to write procedure
references for procedures chosen at execution-time. The
first argument of APPLY() must be a string which names a
procedure; the Snobol system calls that procedure, using as
its arguments the remaining arguments of APPLY() and
observing the usual conventions for extra or missing
arguments. APPLY() returns the value returned by the
procedure it calls, using the same return (RETURN, NRETURN,
or FRETURN).

If APPLY() is used to call a field selection procedure,
then its use is analogous to the use of ITEM() for item
references; the Snobol system forms a field reference using
the first argument as the selector and the second argument
for the family, and NRETURNs the field so selected.

     FLD = 'RLINK'

     APPLY(FLD,CURRENT) = TRIM(INPUT)

     RLINK(CURRENT) = APPLY('TRIM',INPUT)
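
Calling a procedure chosen by name at execution time can be sketched in Python as a table lookup. This is a hedged illustration only: the PROCEDURES table, the define decorator, and the assumption that extra arguments are dropped while missing ones are supplied as the null string are hypothetical stand-ins for "the usual conventions" mentioned above.

```python
import inspect

# Hypothetical table mapping procedure names to Python callables.
PROCEDURES = {}


def define(func):
    """Register func under its upper-cased name, loosely imitating DEFINE()."""
    PROCEDURES[func.__name__.upper()] = func
    return func


@define
def trim(s):
    # Remove trailing blanks, as TRIM() does for card images.
    return s.rstrip(" ")


def apply(name, *args):
    """Call the procedure named by the string name with the remaining arguments.

    Assumption: extra arguments are ignored and missing ones are null ('').
    """
    proc = PROCEDURES[name]
    arity = len(inspect.signature(proc).parameters)
    padded = list(args[:arity]) + [""] * (arity - len(args))
    return proc(*padded)
```

With this sketch, apply("TRIM", "ABC   ") plays the role of APPLY('TRIM',INPUT) in the example above.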



IF()    Returns: null value

IF() always succeeds. Since it is defined to have no
arguments, any arguments in a reference to IF() are
evaluated but otherwise ignored. Thus if any part of that
evaluation fails, that failure causes failure of the rule.
If a reference to a procedure returning a non-null value is
written as an argument of an IF() procedure, the combination
will work like a test procedure. The same principle applies
to other expressions returning values which can similarly be
converted into test procedures.

     N = IF(ARR1[N + 1]) N + 1    : F(OUT)






ANCHOR(any)    Returns: null value

ANCHOR() works like a switch, distinguishing between
null and non-null arguments. Calling ANCHOR() with a
non-null argument turns on the anchored mode of
pattern-matching; calling it again with a null argument
restores the usual, unanchored mode.

     ANCHOR('ON') ; ANCHOR(OFF) ; ANCHOR()



CONVERT(Integer)    Returns: Real
CONVERT(String)     Returns: Real
CONVERT(Real)       Returns: String

CONVERT() is useful for creating and printing real
numbers. If its argument is of datatype Integer, the value
returned is the corresponding real number. The only
permissible String-valued argument is a string of digits,
possibly including an initial sign and possibly including a
decimal point; the returned value is the corresponding real
number. If the argument is of datatype Real, the value
returned by CONVERT() is the numeral string representing the
real number to twelve digits. CONVERT() is defined for
integers and real numbers from about 10**-300 to about
10**300.

     CONVERT(45) ; CONVERT('-57.69') ; CONVERT('.75')
     CONVERT(REALNUMB) ; CONVERT(TRIM(INPUT))
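
The three cases of CONVERT() can be mirrored in a short Python sketch. It is an illustration under assumptions, not the system's implementation: the function name convert, the regular expression used to check numeral strings, and the use of the '.12g' format to stand for "twelve digits" are all hypothetical choices.

```python
import re

# Digits with an optional initial sign and an optional decimal point.
_NUMERAL = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def convert(value):
    """Imitate CONVERT(): Integer or String to Real, Real to String."""
    if isinstance(value, bool):
        raise TypeError("no Snobol datatype corresponds to bool")
    if isinstance(value, int):
        return float(value)                    # Integer -> Real
    if isinstance(value, str):
        if not _NUMERAL.fullmatch(value):
            raise ValueError("not a permissible numeral string: %r" % value)
        return float(value)                    # String -> Real
    if isinstance(value, float):
        return format(value, ".12g")           # Real -> String (assumed format)
    raise TypeError("unsupported datatype")
```

The exact numeral layout produced by the real CONVERT() for a Real argument is not specified above, so the string returned by this sketch should not be taken as the system's output format.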



CODE(String)    Returns: Code

CODE() accepts as its argument a string which is a
Snobol program text; that is, a sequence of syntactically-
correct Snobol statements (see the definition of the
construct <program text> in the syntax, Appendix J), and
returns as its value the corresponding compiled Code; its
use, then, is to permit a program to extend itself while it
is executing. All characters in the Snobol character set,
including space, have their customary significance in the
argument to CODE(). Statement separators are semicolons, but
no final semicolon is required in the string.

          NULP = CODE('LOOP BLWORD "A" = ;'
     +         ' N = LT(N,X) N + 1 : S(LOOP) F($("L" X))')






II.D Input/Output Procedures

INPUT(String, String, String)    Returns: null value
INPUT(Name, String, String)      Returns: null value

INPUT() is used to associate a variable in a Snobol
program with an input file. The first argument is the name
of a variable to be used in the program; the second argument
specifies a SCOPE fileset; the third argument specifies the
number of characters to be read from each record on the
file. (Excess characters are lost; missing characters are
filled out with spaces.) If the variable is already
associated with a file, it loses its previous association.
It is through the INPUT() and OUTPUT() procedures that the
Snobol program establishes contact with the files set up for
it by SCOPE.

     INPUT('READ','INPUT','50')
     INPUT('LNGHEADER','DISKSRT',600)
     INPUT(.LIST[12],'TAPE1',TRIM(INPUT))
     INPUT(.LLINK(NEXT),'INFILE',80)
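
The effect of the third argument of INPUT() on each record can be pictured with a one-line Python sketch. The function name fit_record is hypothetical; the sketch only illustrates the stated rule that excess characters are lost and missing characters are filled out with spaces.

```python
def fit_record(record, length):
    """Return record cut or padded with spaces to exactly length characters."""
    length = int(length)          # the length may be given as a string such as '50'
    return record[:length].ljust(length)
```

For instance, a seven-character record read with a length of 5 loses its last two characters, and a two-character record is extended with three trailing spaces (which TRIM() would then remove).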

OUTPUT(String, String, String)    Returns: null value
OUTPUT(Name, String, String)      Returns: null value

OUTPUT() is used analogously to INPUT(), to associate
variables in Snobol programs with SCOPE filesets which are
to be used for output. The first argument is the name of a
variable to be used in the Snobol program; the second
argument specifies a SCOPE fileset; the third argument is
the carriage control character which will be concatenated at
the head of every record written. (If omitted, none will be
concatenated.) If the variable is already associated with a
file, it loses its previous association.

     OUTPUT('WRITE','OUTPUT','-')
     OUTPUT('PAGE','DISKFIL',1)
     OUTPUT(.LIST[13],'TAPE1','□')
     OUTPUT('PUNCH','PUNCH')
     OUTPUT(.RLINK(NEXT),'OUTFILE')






DETACH(String)    Returns: null value
DETACH(Name)      Returns: null value

DETACH() is used to break the association between the
variable named by its argument and any fileset. There is no
need to DETACH() an associated variable before giving it a
new association. (A variable may be associated with only one
fileset at a time, but a fileset may have many variables
associated with it simultaneously.)

     DETACH('OUTPUT')
     DETACH('WRITE')
     DETACH(.LIST[12])
     DETACH(.RLINK(NEXT))



ENDGROUP(String, Integer)    Returns: null value

ENDGROUP() writes a SCOPE end-of-group mark on the
SCOPE fileset which is specified by its first argument. The
"level" associated with the mark is specified by the second
argument, which must be an integer between 0 and 15
inclusive. Such a mark of any level will cause failure on
input if later read by a Snobol program.

     ENDGROUP('TAPE20',9) ; ENDGROUP('DISKFIL')



REWIND(String)    Returns: null value

REWIND() performs a standard SCOPE rewind on the SCOPE
fileset specified by its argument. The fileset is positioned
at its beginning; if the last operation on this file was a
write, an end-of-group mark of level zero is written before
the file is rewound.

     REWIND('TAPE20') ; REWIND('DISKFIL')



REMARK(String)    Returns: null value

REMARK() is used to write the string which is its
argument onto the special file which is the job log. Obvious
uses are to preserve messages about the course of execution
associated with timing information, and to decorate the
dayfiles.

     REMARK('ENTERING FREEZE TO TAPE20.')
     REMARK('MOTHER IS DEAD.')






FREEZE(String)    Returns: String

FREEZE() is a procedure which permits a programmer to
suspend execution of a compiled Snobol program, and then to
re-load it and re-commence execution. The argument to
FREEZE() is a string which is the name of a SCOPE fileset.
When FREEZE() is encountered during execution, the Snobol
system writes out a copy of the entire field length of the
job onto the fileset specified by the argument, and
execution is terminated. SCOPE then reads and carries out
the next control card. When SCOPE finally hits a control
card asking that the Snobol program be reloaded, it does so
and execution continues from the point where it was frozen.

On a call in a program such as FREEZE('TAPE20'), the
program is "frozen" onto SCOPE fileset TAPE20. Execution
begins again when a SCOPE control card is encountered of the
form LGO,TAPE20. There is no requirement, naturally, that a
frozen program be loaded and executed in the same job in
which it was written out; it can perfectly well be saved on
a COMMON file, or on tape, or even punched out on cards.

It is a peculiarity of FREEZE() that it returns for its
value the string which is its argument. This could be used
to preserve a record of which of several FREEZE()'s had been
executed, but FREEZE() is customarily written where its
returned value is not preserved.

     FREEZE('DISKFIL')

EOI(String)    Returns: null value, or fails

EOI() tests whether the SCOPE fileset specified by its
argument is positioned at the end-of-information on the
file. If so, the procedure succeeds and returns the null
value. If there is more information on the file, the
procedure fails.

     EOI('TAPE20')    : S(OUT)

EORLEVEL(String)    Returns: Integer, or fails

EORLEVEL() tests to see whether the SCOPE fileset named
by its argument is positioned at an end-of-group mark; if
so, the level associated with the mark is returned as the
value of the procedure call. (Such a mark is written by the
ENDGROUP() procedure; the value returned by EORLEVEL() is
the second parameter of the ENDGROUP() which wrote the mark,
from 0 to 15 inclusive.) If the fileset is positioned at
end-of-information (if the EOI() procedure would succeed),
the value returned by EORLEVEL() is -1.

As a practical matter, a fileset will only be
positioned at an end-of-group mark if the last reference to
a variable associated with that fileset failed; customarily,
then, a call to EORLEVEL() would only be made after a
failure on input had occurred, to check the level of the
end-of-group mark which caused the failure. If a call to
EORLEVEL() is executed at any other time (at any time when
the fileset is not at an end-of-group mark), the call to
EORLEVEL() will itself fail.

     EQ(EORLEVEL('TAPE20'),9)    : S(NINE)
     LVL = EORLEVEL('DISKFIL')
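
The three possible outcomes of EOI() and EORLEVEL() can be tabulated in a small Python sketch. The representation of a fileset's position as the string "DATA", the string "EOI", or a pair ("EOG", level) is purely hypothetical, as is the use of None to stand for failure.

```python
def eoi(position):
    """Return the null value '' at end-of-information, else None (failure)."""
    return "" if position == "EOI" else None


def eorlevel(position):
    """Return the end-of-group level, -1 at end-of-information, else None."""
    if position == "EOI":
        return -1
    if isinstance(position, tuple) and position[0] == "EOG":
        return position[1]        # level written by ENDGROUP(), 0 to 15
    return None                   # not at a mark: the call itself fails
```

A program that has just had an input reference fail could, in these terms, distinguish a level-9 group mark from the true end of the file by comparing the value of eorlevel() with 9 and with -1.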






Appendix B. SUMMARY OF PREDEFINED PATTERN VARIABLES

There are precisely six variables initialized to a
value other than the null value when execution of a Snobol
program begins: the six natural variables named ARB, REM,
BAL, FAIL, ABORT and FENCE. Each of these has a pattern as
its initial value, but except for this initialization
receives no special treatment. Each may be assigned any
value by a program, upon which its initial value is lost.
This makes no great difference for ARB, REM, BAL, or FAIL,
but the value of ABORT is a pattern which cannot be
constructed in any other way by a Snobol program, and FENCE
can be constructed only with the use of ABORT.

ARB and REM. The patterns which are the initial values
of ARB and REM are equivalent in effect to two commonly used
patterns which may be constructed by pattern procedures. ARB
is equivalent to the value of the expression ARBNO(LEN(1));
REM is equivalent to the value of the expression RTAB(0).
The Snobol system can and does distinguish between ARB and
ARBNO(LEN(1)), or between REM and RTAB(0); an IDENT()
comparison of such a pair will fail, and PROTOTYPE() will
return different prototype strings for them. But the
performance of either member of a pair in a pattern-matching
statement is exactly the same.

BAL. BAL has as its initial value a pattern which 
matches any non-null string of characters which is 
"balanced" with respect to parentheses — that is, which has 
the same number of left and right parentheses, including 
none, where each left parenthesis occurs before its matching 
right parenthesis. A pattern equivalent to the initial value 
of BAL can be constructed in Snobol, thus providing a 
precise definition of its action: 

     BALEXP = NOTANY('()') | '(' ARBNO(*BALEXP) ')'
     BAL = BALEXP ARBNO(BALEXP)

Again, the system distinguishes between the predefined BAL 
and the pattern constructed by the rules above, but the two 
would perform in the same way in a pattern match. 
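
The notion of a "balanced" string can also be stated as a short Python check. This is a sketch of the definition given above, not of the pattern-matching machinery; the function name is_balanced is hypothetical.

```python
def is_balanced(text):
    """True if text is non-null and balanced with respect to parentheses."""
    if not text:
        return False              # BAL never matches the null string
    depth = 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                return False      # a right parenthesis before its left one
    return depth == 0             # every left parenthesis has been closed
```

Any string accepted by this check is one which BAL could match in its entirety; strings such as ')(' or '(A' are rejected.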

FAIL. FAIL has as its initial value a pattern which
matches no strings (not even the null value), and which thus
always fails. This makes it the "empty" pattern alternative,
one which may be present in any pattern without altering
the set of strings matched. The expressions FAIL | LPAT and
LPAT will match the same set of strings, no matter what
pattern is the value of LPAT. A pattern which would have the
same effect could be constructed by the rule

     FAIL = ANY(NULL)

One use for the empty pattern alternative is to
construct an alternated pattern from data, for instance
with the statements

             IN.PAT = FAIL
     PATLOOP IN.PAT = IN.PAT | TRIM(INPUT)    : S(PATLOOP)

Here the loop statement extends the alternatives of IN.PAT
by one more each time it is successfully executed. If the
data read were the first three letters of the Greek alphabet
spelled out on cards, followed by failure of INPUT, then the
resulting pattern would be equivalent to

     IN.PAT = FAIL | 'ALPHA' | 'BETA' | 'GAMMA'

which matches the same set of strings as does

     IN.PAT = 'ALPHA' | 'BETA' | 'GAMMA'

Note that if IN.PAT had not been first assigned the value
FAIL, the resulting pattern would have been equivalent to

     IN.PAT = NULL | 'ALPHA' | 'BETA' | 'GAMMA'

which is rather different: since it will match the null
value (as its first alternative, in fact), it will always
succeed.
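
The difference between starting the loop from FAIL and starting it from the null value can be imitated in Python by treating an alternation as a plain list of strings. This is a deliberately simplified model: the names build_alternation and matches are hypothetical, and a "match" here just means that some alternative occurs somewhere in the subject.

```python
FAIL = []        # an alternation with no alternatives matches nothing
NULL = [""]      # the null string occurs in every subject


def build_alternation(cards, start):
    """Extend the alternation start by one alternative per card read."""
    alternatives = list(start)
    for card in cards:
        alternatives.append(card.rstrip(" "))   # like TRIM(INPUT)
    return alternatives


def matches(alternatives, subject):
    """True if any alternative occurs anywhere in subject."""
    return any(alternative in subject for alternative in alternatives)


CARDS = ["ALPHA   ", "BETA    ", "GAMMA   "]
```

Built from FAIL, the alternation matches only subjects containing ALPHA, BETA, or GAMMA; built from NULL, it matches every subject, which is the pitfall noted above.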

ABORT. ABORT has as its initial value a pattern which
causes immediate failure of an entire pattern match when it
is encountered. The usefulness of ABORT is that it permits a
pattern match to fail if something is found. For instance,

     SH.PAT = LEN(10) ABORT | ':'

is a pattern which will fail by ABORT if it is set to search
a string of ten or more characters; shorter strings it will
search for a colon. It will succeed, then, only on a string
of nine or fewer characters containing a colon. More
generally, patterns which have characteristic p but not q
can often be written in the form q ABORT | p.
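
The behaviour of the SH.PAT example under an unanchored match can be traced with a Python sketch. It follows the scanner position by position under the assumption that the first alternative is tried before the second at each position; the function name sh_pat is hypothetical.

```python
def sh_pat(subject):
    """Imitate matching LEN(10) ABORT | ':' against subject, unanchored."""
    for cursor in range(len(subject) + 1):
        if len(subject) - cursor >= 10:
            return False          # LEN(10) matched, so ABORT ends the whole match
        if subject.startswith(":", cursor):
            return True           # second alternative: a colon at this position
    return False                  # no colon anywhere: ordinary failure
```

The function returns True exactly for strings of nine or fewer characters which contain a colon, in agreement with the description above.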

FENCE. The initial value of FENCE is a pattern which
has the following interesting property: when encountered in
a pattern match it matches the null value, and then if the
remainder of the pattern cannot be successfully matched from
that point, the match will fail. A pattern which would have
the same effect could be constructed by the rule

     FENCE = NULL | ABORT

When FENCE is used as the first element of a pattern,
its effect is like writing POS(0); it "anchors" the pattern
so that it must match beginning with the first character.
When FENCE is used after other pattern elements, then its
effect is that of a conditional "anchor" applying only to
the remainder of the pattern, and only if the elements to
the left of FENCE within its alternative have been
successfully matched.






Appendix C. SUMMARY OF OPERATORS

Operator        Operation                 Precedence

unary *         deferred evaluation       7 (highest)
unary .         name                      7
unary $         indirect reference        7
binary .        conditional assignment    6
binary $        immediate assignment      6
binary *        multiplication            5
binary /        division                  5
unary +         plus                      4
unary -         minus                     4
binary +        addition                  3
binary -        subtraction               3
binary □        concatenation             2
binary |        alternation               1 (lowest)






Appendix D. SUMMARY OF PROCEDURE EXECUTION

When a call is made to a programmer-defined procedure:
(1) the arguments are evaluated; (2) the variable name which
is the same as the procedure name is made to refer to an
internal "result variable"; (3) the formal variable names
are made to refer to internal "formal variables"; (4) any
additional names in the first argument of the DEFINE()
procedure are made to refer to additional internal
variables; (5) the formal variables are assigned the values
of their corresponding arguments; (6) the result variable
and all additional internal variables are assigned the null
value; (7) control passes to the statement of the procedure
body whose label is specified by the second argument of the
DEFINE() procedure (this may be expressed by default); (8)
execution of the statements of the procedure body continues
until a return transfer is executed.

When return is made from a procedure using RETURN: (1) 
the last value assigned to the result variable is returned 
as the value of the procedure call; (2) the variables 
previously referred to by the formal variable names, the 
result variable name, and any additional internal variable 
names, are restored; (3) execution of the calling statement 
continues from the point of the procedure call. 

When return is made from a procedure using NRETURN: the
variable named by the last value assigned to the result
variable (which must be a string or a Name) is returned as
the value of the procedure call; the remaining actions are
the same as for RETURN.

When return is made from a procedure using FRETURN: (1) 
the variables previously referred to by the formal variable 
names, the result variable name, and any additional internal 
variable names are restored; (2) the call fails, the rule 
from which the call was made fails, and control is returned 
to the go-to of the calling statement where the failure 
transfer will be taken. 
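
The saving and restoring of variables around a procedure call can be sketched in Python with a dictionary standing for the program's variables. This is an illustration under assumptions: the function call_procedure, the representation of the procedure body as a Python callable returning 'RETURN' or 'FRETURN', and the use of None for failure are all hypothetical, and NRETURN is left out.

```python
def call_procedure(variables, name, formals, extras, body, args):
    """Imitate a call of a programmer-defined procedure.

    variables: dict of current variable values (unset variables are null, '')
    name:      procedure name, also the name of the result variable
    formals:   formal variable names; extras: additional internal names
    body:      callable taking variables, returning 'RETURN' or 'FRETURN'
    """
    names = [name] + list(formals) + list(extras)
    saved = {n: variables.get(n, "") for n in names}   # set outer variables aside
    for position, formal in enumerate(formals):
        # Formal variables receive the argument values; missing ones are null.
        variables[formal] = args[position] if position < len(args) else ""
    variables[name] = ""                               # result variable starts null
    for extra in extras:
        variables[extra] = ""
    try:
        outcome = body(variables)
        result = variables[name] if outcome == "RETURN" else None
    finally:
        variables.update(saved)                        # restore the outer variables
    return result
```

After the call, the caller's own values of the procedure name, the formal names, and the additional names are exactly what they were before it, whichever return was taken.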






Appendix H. PROGRAM TEXT REPRESENTATION

Each statement of a Snobol program is usually punched
on a separate 80 column card. Only the first 72 columns,
however, may be used for the statement; the remaining
columns may be used for purposes of identification. (For
example, sequence numbers may be punched there which would
allow you to put the deck back in order, either by hand or
with a mechanical sorter, if the cards should be
disarranged.) All columns of the card appear in the printed
listing of the program when it is executed, but 10 spaces
are provided between columns 72 and 73 to separate any
identification from the statement.

Statement Format. If the label of a statement is
present it must be punched starting in column 1. If the
label is absent and the rule is present, then column 1 must
be left empty and the rule may be punched beginning in
column 2 or beyond. If the statement consists only of a
go-to, the colon introducing it may be punched in column 1.

Wherever a single blank occurs in a statement, any
number of blanks would serve as well; wherever many blanks
occur, a single blank would serve as well. Since all parts
of a statement may be absent, a totally blank card is
treated as a null statement.

The semicolon may be used as a delimiter between 
statements, making it possible to punch more than one 
statement per card. The semicolon signals the end of a 
statement, so the column directly after the semicolon is 
treated as "column 1" of the following statement. For 
example, four assignment statements may be punched on a
single card as follows:

      ONE = 1; TWO = 2; THREE = 3;LAST FOUR = 4

Note that the final statement of the sequence has a label, 
while the others do not. A semicolon is assumed at the end 
of a card which is not followed by a continuation card. 

Continuation Cards. More commonly, a method is needed
for dealing with statements which are too long rather than
too short. Statements which are too long to fit on a single
card may be continued onto as many cards as necessary. This
is done by means of continuation cards, each of which has
either a plus sign or a period punched in column 1,
indicating that its information is a continuation of
whatever appeared on the foregoing card. Statements may be
broken anywhere; a blank is never assumed at the break.
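
The treatment of continuation cards can be illustrated by a Python sketch which joins card images into statement texts. The function name join_cards is hypothetical, and the sketch deliberately ignores semicolons, comment cards, and listing control cards; it shows only the use of columns 1 to 72 and the rule that no blank is assumed at a break.

```python
def join_cards(cards):
    """Join card images into statement texts, honoring continuation cards."""
    statements = []
    for card in cards:
        field = card[:72]                      # columns 73-80 are identification
        if field[:1] in ("+", ".") and statements:
            # Continuation card: append columns 2-72 with no blank inserted.
            statements[-1] += field[1:]
        else:
            statements.append(field)
    return statements
```

A string literal broken across two cards is therefore rejoined exactly as punched, so any blank wanted at the break must actually appear on one of the cards.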






Comment Cards. Comments may be introduced into the
program with the use of comment cards, which are
distinguished by having an asterisk in column 1, and any
other information in the remaining columns. Comment cards
may appear anywhere within the program deck except directly
before a continuation card. Comments themselves may not be
continued by placing a plus sign or a period in column 1.

Listing Control Cards. A card with a minus sign in
column 1 is a listing control card, used to specify the
format of the listing which is produced by the compiler. The
word appearing after the minus sign specifies what is to be
done to the listing, as follows:

-SPACE Leave a blank line in the listing.

-EJECT Print the next statement of the compiler
listing at the top of a new page.

-UNLIST Stop printing the statements of the program
text until a listing control card specifying LIST is
encountered.

-LIST Resume printing the program text.

Listing control cards, like comment cards, may appear
anywhere within the program deck except directly before a
continuation card.

Extended Syntax of Snobol Statements. In addition to
the forms used for them in example program texts, certain
language elements have alternative representations.

Array Prototypes. Instead of colons in the argument of
the ARRAY() procedure, slashes may be used. The rules

     LIST = ARRAY('0:2,0:3')

and

     LIST = ARRAY('0/2,0/3')

would assign identically-dimensioned arrays as the value of
LIST. The PROTOTYPE() procedure returns colons in its
canonical version of the prototype string, regardless of
which character was used in the argument of ARRAY().

Item References. Instead of left and right brackets
around the selector of an item reference, a combination of
parentheses and adjacent slashes may be used. For example,
LIST[2,3] and LIST(/2,3/) are alternative ways of writing
the same item reference.






Go-to Parts. Rather than a colon to introduce a go-to
part, a slash may be used; but a slash used for this purpose
must not be followed by a blank. Thus,

     VOWELS = TRIM(INPUT)    : F(ERROR)

and

     VOWELS = TRIM(INPUT)    /F(ERROR)

are equivalent statements.

Instead of left and right brackets in direct go-to's
(used only in connection with objects of datatype Code), the
parentheses and adjacent slashes notation may be used, in
the same way as for item references. Thus, the two
statements

     RESULT = CODE(TRIM(INPUT))    : [RESULT]

and

     RESULT = CODE(TRIM(INPUT))    : (/RESULT/)

are equivalent, as is

     RESULT = CODE(TRIM(INPUT))    /(/RESULT/)

Pattern Alternations. The alternation operator may be
written as two adjacent slashes, bounded by blanks, instead
of the usual single character. Thus, X | Y and X // Y may be
written with the same effect.

String Literals. Within string literals, all characters
other than the quotation mark (single or double) being used
as the delimiter of that literal may be used freely. The
delimiter character may occur within the string only in
pairs, and each such pair will be taken to represent a
single instance of the character. For example, the rules
containing a single string literal each

     AWW = """ALL'S□WELL"""

and

     AWW = '"ALL''S□WELL"'

are equivalent to the rule containing a concatenation of
three string literals

     AWW = '"ALL' "'" 'S□WELL"'

Any one of them would assign to AWW the 12-character string
"ALL'S WELL".

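The doubled-delimiter rule for string literals can be checked with a short Python sketch. The function name literal_value is hypothetical, and an ordinary space is used where the manual writes the space symbol.

```python
def literal_value(literal):
    """Return the string denoted by a Snobol string literal.

    The literal must begin and end with the same quotation mark, and
    that mark may occur inside only in pairs, each standing for one mark.
    """
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        raise ValueError("not a string literal: %r" % literal)
    delimiter = literal[0]
    body = literal[1:-1]
    if delimiter in body.replace(delimiter * 2, ""):
        raise ValueError("unpaired delimiter inside literal: %r" % literal)
    return body.replace(delimiter * 2, delimiter)
```

Applied to the three forms of the AWW example, the sketch yields the same 12-character string each time, the third form being the concatenation of the values of its three literals.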





Appendix I. CHARACTER SET REPRESENTATIONS 

The Snobol character set consists of sixty-three
characters: the capital letters A-Z, followed by the digits
0-9, followed by the remaining characters in the order

     + - * / ( ) $ = □ , . ≡ [ ] : " → | ∧ ' ↓ < > ≤ ≥ ¬ ;

This ordering of the sixty-three characters is called their 
standard collating sequence. Fifty-four of these play a part 
in the syntax of the language (see Appendix J), and have 
equivalents in the reference symbol set used to construct 
program texts; the remaining nine characters may occur only 
in string literals or in data read from input files. 

Program texts in examples are shown in symbols from the 
reference set. For input each of these must be represented 
by a punched card code produced on a keypunch (either model 
026 or model 029) ; for output each will be represented by a 
character on a line printer. Each symbol of the reference 
set has a single card code, and a single printer 
representation. Each card code and printer representation 
corresponds to a single reference symbol, except for one 
special case: the blank used to separate language elements 
and the space character (□) used in literal data have the
same card code and printer representation, although they are 
differentiated in the reference symbol set for clarity. 

The reference symbol set consists of the twenty-six 
capital letters, the ten digits, and nineteen special 
characters. Codes for the letters and digits are produced by 
the keys marked with them on both an 026 or an 029 keypunch, 
and all have the expected representation on a line printer. 

The special characters in the reference symbol set are
shown in the accompanying chart. On an 026 keypunch, codes
for the reference symbols are produced by keys marked with
the same symbols where they exist, but six symbols (: ; " | [ ])
have no keys and so they must be multiple-punched. (In
Snobol expressions, though obviously not in literal data,
these six symbols may be avoided by using the extended syntax
described in Appendix H.) On an 029 keypunch, codes for all
but one of the reference symbols (|) are produced by some
key, but most of the keys are marked with different symbols.
On a line printer, all but three of the reference symbols
(" ' |) look like their counterparts in the reference set. The
final nine characters in the chart are those without
equivalent reference symbols.



Snobol   026        card      line printer          Snobol                            029
symbol   key        code      character             usage                             key

=        =          8-3       =  (equal)            assignment                        #
.        .          12-8-3    .  (period)           condit. assign., name, real lit.  .
,        ,          0-8-3     ,  (comma)            list separator                    ,
:        none       8-2       :  (colon)            go-to's, array prototypes         :
;        none       12-8-7    ;  (semicolon)        statement terminator              |
"        ...        8-4       ≠  (not equal)        string literal delimiter          @
'        none       11-8-5    ↑  (up arrow)         string literal delimiter          )
$        $          11-8-3    $  (dollar)           indirect ref., immed. assign.     $
|        none       11-0      ∨  (logical or)       alternation                       none
(        (          0-8-4     (  (left paren)       arg. lists, expr. grouping        %
)        )          12-8-4    )  (right paren)      arg. lists, expr. grouping        <
[        none       8-7       [  (left bracket)     item ref., direct go-to's         "
]        none       0-8-2     ]  (right bracket)    item ref., direct go-to's         0-8-2
-        -          11        -  (minus)            negative, subtraction             -
+        +          12        +  (plus)             positive, addition                &
*        *          11-8-4    *  (asterisk)         deferred eval., multiplication    *
/        /          0-1       /  (slash)            division                          /
blank    space bar  blank     (space)               concatenation, separator          space bar
□        space bar  blank     (space)               data only                         space bar
         none       0-8-6     ≡  (identity)         data only                         >
         none       0-8-5     →  (right arrow)      data only                         _
         none       0-8-7     ∧  (logical and)      data only                         ?
         none       11-8-6    ↓  (down arrow)       data only                         ;
         none       12-0      <  (less than)        data only                         none
         none       11-8-7    >  (greater than)     data only                         ¬
         none       8-5       ≤  (less or equal)    data only                         '
         none       12-8-5    ≥  (greater or equal) data only                         (
         none       12-8-6    ¬  (logical not)      data only                         +






Appendix J. SYNTAX OF PROGRAM TEXTS 



1.  <string literal> ::=
        ' <string format 1> ' |
        " <string format 2> "

2.  <digit string> ::=
        <digit> |
        <digit string> <digit>

3.  <integer literal> ::=
        <digit string>

4.  <real literal> ::=
        <digit string> . |
        . <digit string> |
        <digit string> . <digit string>

5.  <literal> ::=
        <string literal> |
        <integer literal> |
        <real literal>

6.  <identifier> ::=
        <letter> |
        <identifier> <letter> |
        <identifier> <digit> |
        <identifier> .

7.  <simple variable> ::=
        <identifier>

8.  <subscript list> ::=
        <expression> |
        <subscript list> <,> <expression>

9.  <array item reference> ::=
        <simple variable> <[> <subscript list> <]>

10. <procedure identifier> ::=
        <identifier>

11. <argument list> ::=
        <optional expression> |
        <argument list> <,> <optional expression>






12. <procedure reference> ::=
        <procedure identifier> <(> <argument list> <)>

13. <variable> ::=
        <simple variable> |
        $ <primary> |
        <array item reference> |
        <procedure reference>

14. <primary> ::=
        <literal> |
        <variable> |
        . <variable> |
        <(> <expression> <)>

15. <factor> ::=
        <primary> |
        <factor> <blank> ** <blank> <primary>

16. <multiplying operator> ::=
        <blank> * <blank> |
        <blank> / <blank>

17. <term> ::=
        <factor> |
        <term> <multiplying operator> <factor>

18. <adding operator> ::=
        <blank> + <blank> |
        <blank> - <blank>

19. <sum> ::=
        <term> |
        + <term> |
        - <term> |
        <sum> <adding operator> <term>

20. <concatenation> ::=
        <sum> |
        <concatenation> <blank> <sum>

21. <expression> ::=
        <concatenation>

22. <deferred pattern> ::=
        * <variable>






23. <pattern assignment operator> ::=
        <blank> $ <blank> |
        <blank> . <blank>

24. <pattern assignment> ::=
        <pattern primary> <pattern assignment operator>
        <variable>

25. <pattern primary> ::=
        <literal> |
        <variable> |
        . <variable> |
        <deferred pattern> |
        <pattern assignment> |
        <(> <pattern expression> <)>

26. <pattern factor> ::=
        <pattern primary> |
        <pattern factor> <blank> ** <blank> <pattern primary>

27. <pattern term> ::=
        <pattern factor> |
        <pattern term> <multiplying operator> <pattern factor>

28. <pattern sum> ::=
        <pattern term> |
        + <pattern term> |
        - <pattern term> |
        <pattern sum> <adding operator> <pattern term>

29. <pattern concatenation> ::=
        <pattern sum> |
        <pattern concatenation> <blank> <pattern sum>

30. <pattern alternation> ::=
        <pattern concatenation> |
        <pattern alternation> <blank> <|> <blank>
        <pattern concatenation>

31. <pattern expression> ::=
        <pattern alternation>

32. <optional expression> ::=
        <null> |
        <pattern expression>

33. <label> ::=
        <identifier>






34. <label part> ::=
        <null> |
        <label>

35. <right side> ::=
        <=> <optional expression>

36. <rule part> ::=
        <null> |
        <blank> <primary> |
        <blank> <primary> <blank> <pattern expression> |
        <blank> <variable> <right side> |
        <blank> <variable> <blank> <pattern expression>
        <right side>

37. <loc> ::= <location expression> ::=
        <(> <label> <)> |
        <(> $ <primary> <)> |
        <[> <expression> <]>

38. <go-to part> ::=
        <null> |
        <:> <loc> |
        <:> S <loc> |
        <:> F <loc> |
        <:> S <loc> <optional blank> F <loc> |
        <:> F <loc> <optional blank> S <loc>

39. <statement> ::=
        <label part> <rule part> <go-to part>

40. <program text> ::=
        <statement> |
        <program text> <;> <statement>

41. <letter> ::=
        A | B | C | D | E | F | G | H | I | J | K | L | M |
        N | O | P | Q | R | S | T | U | V | W | X | Y | Z

42. <digit> ::=
        0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

43. <blank> ::=
        □ | <blank> □

44. <optional blank> ::=
        <null> |
        <blank>






45. <string format 1> ::=
        <null> |
        <string format 1> <class 1 character>

46. <class 1 character> ::=
        <any character except '> | ''

47. <string format 2> ::=
        <null> |
        <string format 2> <class 2 character>

48. <class 2 character> ::=
        <any character except "> | ""

49. <(> ::= ( <optional blank>

50. <)> ::= <optional blank> )

51. <[> ::= [ <optional blank> |
            (/ <optional blank>

52. <]> ::= <optional blank> ] |
            <optional blank> /)

53. <|> ::= <the character |> | //

54. <:> ::= <optional blank> : <optional blank> |
            <optional blank> /

55. <,> ::= <optional blank> , <optional blank>

56. <=> ::= <optional blank> = <optional blank>

57. <;> ::= <optional blank> ;

58. <null> ::=






Appendix K. SUMMARY OF COMPILE-TIME ERROR MESSAGES



Each statement which is syntactically incorrect is
marked in the program listing by an up arrow which is
printed beneath its statement number along with the message
ERROR. It is planned that in the future a specific message
for each particular type of syntactic error will be
provided.






Appendix L. SUMMARY OF EXECUTION-TIME ERROR MESSAGES 

When an error is detected during the execution of a
Snobol program, the Snobol interpreter writes a message on
the output file and then ceases execution. The message
consists of three parts: (1) the identifying number of the
statement being executed when the error was detected (each
statement of the program text is given a number by the
compiler, and these numbers appear at the left of the
statements in the compiler listing of the program text); (2)
the level of procedure execution at the time the error was
detected (the same information which would be returned by
the predefined procedure FNCLEVEL()); (3) one of the error
messages from the list below, specifying which of the
fifty-two possible errors was detected.

Some of the messages in the following list are self- 
explanatory. Notes have been added to many messages 
amplifying them, or explaining terminology which differs 
from that used in this description of Snobol, or 
recommending page numbers and sections where further
information relevant to the interpretation of the message 
can be found. 



THE LEFT OPERAND FOR A PATTERN MATCH MUST BE A STRING.

THE RIGHT OPERAND FOR A PATTERN MATCH MUST BE A
PATTERN.

PATTERN MATCH WITH REPLACEMENT REQUIRES STRING-VALUED
RIGHT HAND SIDE.

TRANSFER TO AN UNDEFINED LABEL. A go-to specifies a
transfer to a label which is not present in the program
text, and which is not RETURN, FRETURN, NRETURN, or END.

A FAILURE OCCURRED IN THE EVALUATION OF THE GO-TO 
PART. Conditions which would cause failure in the rule 
part of a statement cause an error in the go-to part (see 
page 68) . 

TYPE ERROR IN GO-TO PART. Either the operand of an
indirect referencing operator in the go-to is not a string
or a Name (see page 67), or else the value of the expression
in a direct go-to is not an object of datatype Code.

FORBIDDEN OPERAND TYPE FOR ALTERNATION. Operands of 
the alternation operator must be of datatype String, 
Integer, or Pattern (see page 35). 






THE DATA TYPE USED MAY ONLY BE CONCATENATED WITH THE
NULL STRING. Strings, Integers, and Patterns may be
concatenated freely. An object of any other datatype may be
concatenated only with the null value.

THE VALUE OF A VARIABLE IN A DEFERRED-EVALUATION
PATTERN (UNARY *) MUST BE A PATTERN OR STRING. See the
description of the deferred evaluation operator, page 50.

LEFT OPERAND FOR BINARY $ AND . MUST BE A PATTERN.
See the descriptions of the immediate and conditional
assignment operators, pages 38 and 40.

INDIRECT REFERENCE TO THE NULL STRING. The operand of 
the indirect referencing operator may not be the null value 
(see page 51) . 

OPERAND FOR INDIRECTION MUST BE NAME OR STRING. The
operand of the indirect referencing operator must be a 
string or a Name (see page 57). 

NON-INTEGER STRING USED IN NUMERIC CONTEXT. Only
strings of datatype Integer -- those consisting of an
optional sign followed by an optional string of digits --
may be used where Integers are expected.

TYPE ERROR IN NUMERIC CONTEXT. An object of either 
datatype Integer or Real was expected, but an object of some 
other datatype occurred. 

DIVISION BY ZERO WAS ATTEMPTED.

STRING ARITHMETIC NOT YET IMPLEMENTED. Integers may
have values of magnitudes as large as IQi^oooo, but the 
arithmetic operations are defined only for integers of 
magnitudes less than IQio. It is intended that the 
arithmetic operations should be extended to integers as
large as can be represented, by performing "string
arithmetic" on the digit strings of which they are composed.

REAL ARITHMETIC OVERFLOW. A real number larger than 
can be represented has been produced (about lO^oo).

MIXED MODES (INTEGER, REAL) FOR ARITHMETIC OPERATION. 
The operands of arithmetic operators (and the arguments of 
predefined arithmetic test procedures) must be of the same 
datatype. If operands of different datatypes are to be 
operated upon, one must first be converted (see the 
description of CONVERT () in Appendix A, section II .C). 






WRONG PARAMETER TYPE FOR STANDARD PROCEDURE. An
argument of a predefined procedure is of an incorrect 
datatype. Permissible datatypes of arguments for all 
predefined procedures are given in Appendix A. 

ARGUMENT FOR LEN, POS, RPOS, TAB, OR RTAB MUST BE IN
THE INTERVAL [0,2**17-1]. The integer arguments to these
five predefined pattern procedures must be non-negative, and
must be less than 131,072.
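
The range check behind this message can be sketched in a few lines. The Python below is only an illustration of the stated interval; the names MAX_PATTERN_INT and is_valid_pattern_int are hypothetical and are not part of any Snobol system.

```python
# Upper bound taken from the message text: arguments lie in [0, 2**17 - 1].
MAX_PATTERN_INT = 2**17 - 1  # 131071, i.e. "less than 131,072"


def is_valid_pattern_int(n: int) -> bool:
    """Return True if n would be accepted by LEN, POS, RPOS, TAB, or RTAB.

    Hypothetical helper: it models only the documented interval,
    not the behavior of the procedures themselves.
    """
    return 0 <= n <= MAX_PATTERN_INT
```

Under this sketch, 0 and 131071 are accepted, while -1 and 131072 would produce the error message.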

SYNTAX ERROR IN STRING TO BE COMPILED. An argument
string for the CODE() procedure is incorrect; see the
description of CODE() in Appendix A, section II.C, and the
Syntax of Program Texts in Appendix J.

INCORRECT SYNTAX FOR STRING TO BE CONVERTED TO REAL.
See the description of CONVERT() in Appendix A, section
II.C.

IMPROPER ARGUMENT FOR PSEUDO-FIELD FUNCTION (FIRST,
REST, LEFT, RIGHT, PARAM, FAMILY, OR SELECTOR). The
arguments of the predefined field selection procedures
PARAM(), FIRST(), REST(), LEFT(), RIGHT(), FAMILY(), and
SELECTOR() are quite specialized; see the descriptions of
these procedures in Appendix A, section I.C.



CALL OF AN UNDEFINED PROCEDURE. The DEFINE()
declaration for a programmer-defined procedure must be
executed before it can be invoked (see page 72).



SYNTAX ERROR IN PROCEDURE PROTOTYPE. There is an 
error in the form of the string which is the first argument 
of the DEFINE procedure (see page 72). 

RETURN FROM LEVEL ZERO. A transfer to RETURN, 
FRETURN, or NRETURN has been executed in a main program (see 
page fl7) . 

AN -NRETURN- WAS EXPECTED FROM THE PROCEDURE CALLED.
A procedure call occurs where a variable is required, but 
the procedure does not return by NRETURN; see the 
description of NRETURN, page 90. 

A PROCEDURE RETURNING BY -NRETURN- MUST SUPPLY A NAME
AS ITS VALUE. When a procedure returns by NRETURN, the
value of the result variable must be a string or an object
of datatype Name; see the description of NRETURN, page 90.

VARIABLE TO THE LEFT OF A [ DOES NOT CONTAIN AN
ARRAY. The value of the family part of an item reference
is not of datatype Array. See the description of item
references, page 101.

TOO MANY SUBSCRIPTS IN AN ARRAY REFERENCE. There are
more index expressions in the selector of an item reference
than there are dimensions defined for the family being
indexed. See pages 106 and 109.

TOO FEW SUBSCRIPTS IN AN ARRAY REFERENCE. There are
fewer index expressions in the selector of an item reference
than there are dimensions defined for the family being
indexed. See pages 106 and 109.

ILLEGAL CHARACTER IN ARRAY PROTOTYPE. See the
description of the argument for the ARRAY() procedure, page
104.

SYNTAX ERROR IN ARRAY PROTOTYPE. See page 104.

LOWER BOUND GREATER THAN UPPER BOUND IN ARRAY
PROTOTYPE. See page 104.

AN ARRAY BOUND WAS TOO LARGE. An expression for an 
upper or lower bound in an Array prototype was greater in 
magnitude than 131,071. 

AN ARRAY DIMENSION WAS TOO LARGE. The difference
between any pair of upper and lower bounds was greater in
magnitude than 131,071.

AN ARRAY MUST CONTAIN FEWER THAN 2**17 ELEMENTS. A
prototype string for the ARRAY() procedure specifies an
array containing more than 131,071 items.

SYNTAX ERROR IN SELECTOR FOR ITEM(). See the
description of the ITEM() procedure, page 108.

SYNTAX ERROR IN DATA PROTOTYPE. See the description 
of the argument of the DATA() procedure in Appendix A, 
section II. A. 

DUPLICATE NAMES IN DATA PROTOTYPE. Two fields defined 
for objects of a single datatype may not have the same name, 
nor may a field name be the same as the datatype — 
otherwise all the necessary procedures could not exist 
simultaneously. See the description of DATA () in Appendix A, 
section II. A. 

DATA CONSTRUCTOR CANNOT SUPPLY A NAME. Structure
creation procedures, predefined or programmer-defined, do
not return Names, but rather objects of datatype Array or of
a programmer-defined datatype, respectively.

THE PARAMETER FOR A FIELD FUNCTION WAS NOT A DATA
REFERENCE. The argument of a programmer-defined field
selection procedure was not an object of a
programmer-defined datatype.

NO SUCH FIELD IN THE REFERENCED DATA STRUCTURE. The
structure which is the argument of a programmer-defined
field selection procedure does not contain a field
identified by that procedure name.

FILE SPECIFIED TO I/O PROCEDURE MUST BE CURRENTLY
ATTACHED. The filesets named by the arguments of
ENDGROUP(), REWIND(), EORLEVEL(), and EOI() must be
currently associated with some variable (see Appendix A,
section II.D).

ILLEGAL FILENAME GIVEN TO I/O ASSOCIATION PROCEDURE. 
A legal SCOPE fileset name is a string of one to seven 
letters and digits, beginning with a letter (see Appendix A, 
section II. D) . 
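
The naming rule just stated (one to seven letters and digits, beginning with a letter) can be expressed as a single pattern. The following Python sketch is illustrative only; the helper name is hypothetical, and restricting letters to upper case is an assumption made here.

```python
import re

# One letter followed by up to six further letters or digits: 1 to 7 characters.
# Assumption: only upper-case letters are considered.
_FILESET_NAME = re.compile(r"[A-Z][A-Z0-9]{0,6}")


def is_legal_fileset_name(name: str) -> bool:
    """Hypothetical check modelling the fileset-name rule stated above."""
    return _FILESET_NAME.fullmatch(name) is not None
```

For instance, PUNCH and OUTPUT satisfy the rule, while an eight-character name or a name beginning with a digit does not.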

ATTEMPT TO READ PAST END-OF-INFORMATION. See the
descriptions of EORLEVEL() and EOI() in Appendix A, section
II.D.

STRING TO BE DISPLAYED WAS LONGER THAN 80 CHARACTERS.
The string which is the argument to the REMARK() procedure
must contain 80 or fewer characters.

ONLY STRINGS MAY BE OUTPUT. A value of a datatype
other than String or Integer was assigned to a variable
which currently has an output association.

THE MAXIMUM FIELD LENGTH HAS BEEN EXCEEDED. The
program requires more storage to execute than was requested. 

THE MAXIMUM STRING LENGTH HAS BEEN EXCEEDED. See the 
description of MAXLNGTH () in Appendix A, section II. D. 

THE STATEMENT LIMIT HAS BEEN EXCEEDED. See the
description of STLIMIT() in Appendix A, section II.B.

COMPILER STACK OVERFLOW, SIMPLIFY THE CONSTRUCTION. A 
storage area for intermediate results in the Snobol compiler 
has been exhausted. The statement should be rewritten as two 
or more statements, since it contains too many levels of 
nested parentheses. 






Appendix M. Non-standard Features of Berkeley Snobol 

The initial design and implementation of Snobol4 was
done at Bell Telephone Laboratories for IBM System 360
machines. The latest version of this implementation is
described in The SNOBOL4 Programming Language by R. E.
Griswold, J. F. Poage, and I. P. Polonsky (second edition,
Prentice-Hall, 1971). This book contains many interesting
examples and should be of use to all serious Snobol
programmers, even those who are working with non-standard
implementations for different machines.

The implementation described here was produced at the
Computer Center of the University of California at Berkeley
by Paul McJones and Charles Simonyi for CDC 6000 series
machines. The language they implemented, which we shall call
the Berkeley version, is non-standard since it differs from
the Bell version in three basic ways: some features of the
language are handled differently, some features are absent,
and some new features not present in the Bell version are
provided. This appendix describes the differences between
the Bell version and the Berkeley version, presenting the
information in terms of these three types of differences. It
is provided to make this more comprehensible description of
the Snobol language useful to those writing programs in the
Bell version, and to specify which parts of the Bell
documentation are useful for those writing programs in the
Berkeley version of the language.

Quite apart from differences between the two versions 
of the Snobol language, there are some differences in 
terminology between the documentation of Griswold, Poage,
and Polonsky, and the present description. The pairs of 
terms in the following table are equivalent, and represent 
differences in the descriptions only, not in the language 
versions described. 



     Bell description             this description

     primitive                    predefined
     defined                      programmer-defined
     function                     procedure
     predicate                    test procedure
     value of function name       value of result variable
     formal argument              formal variable
     local variable               internal variable
     function procedure           procedure body
     entry point                  entry label
     explicit name                string name
     created name                 Name
     implicit name                Name
     generated variable           indirect reference
     aggregate                    family
     referencing argument         selector
     array element                array item
     array reference              item reference
     field function               field selection procedure
     source program               program text
     statement component          statement part
     subject (assignment)         left side
     subject (pattern match)      string reference
     object                       right side
     compilation error            compile-time error
     program error                execution-time error



I. Features which are Handled Differently

Procedures. In the Bell version, it is an execution-
time error to call a predefined procedure with more
arguments than its definition prescribes; in the Berkeley
version, extra arguments to all procedures are evaluated but
otherwise ignored.

Since the character sets of IBM System 360 machines and
CDC 6000 series machines are different, the ALPHABET()
procedure, which returns a string specifying the character
set in standard collating sequence, necessarily returns a
different string in the two versions. (This procedure exists
as a keyword in the Bell version.)

Since the Bell system uses FORTRAN IV I/O, and the
Berkeley system does its own I/O, the INPUT() and OUTPUT()
procedures require quite different sorts of arguments.

The ARRAY procedure has two arguments in the Bell 
version, the second specifying an initial value to be 
assigned to all items of an array. In the Berkeley version, 
the ARRAY procedure has one argument only; all items are 
initialized to the null value. 



Since numeric strings are of datatype Integer in the
Berkeley version, IDENT('1',1) succeeds while in the Bell
version it fails. In the Bell version, patterns are
considered identical only if they are indeed the same
pattern. Thus

          X = A | B
          Y = A | B
          IDENT(X,Y)

fails since two different copies of the pattern are being
compared. In the Berkeley version this comparison would
succeed, since patterns constructed in the same way are
considered identical. IDENT(.VAR,'VAR') fails in the
Berkeley version while it succeeds in Bell owing to the
different implementations of the Name operator (described in
the section on operators below).

The CODE() procedure in the Berkeley version does not
allow labels to be redefined; consequently the labels of the
statements which are to be added to the program during
execution must be different from any existing labels of the
program.

The Bell version provides more datatypes than does the
Berkeley version and much more flexibility about converting
from one datatype to another. In the Bell version, the
CONVERT() procedure which is used for this purpose has two
arguments; the second argument specifies the datatype to
which the first argument is to be converted. In the Berkeley
version the CONVERT() procedure has only one argument since
only a limited kind of conversion is available. If the
single argument of CONVERT() is a numeral string or an
integer it is converted into a real number; if the single
argument is a real number, it is converted into a string.

Operators. The interrogation operator (?) has been
implemented as the IF() procedure (see Appendix A, section
II.C).

The unary operator * is called in the Bell version the
unevaluated expression operator, and expressions introduced
by it are of datatype Expression. This operator is defined
more narrowly in the Berkeley version. It is called the
deferred evaluation operator, and may be applied to
variables only; thus *EQ(X,Y) causes an error. The datatype
Expression is not defined in the Berkeley version;
expressions introduced by the deferred evaluation operator
are of datatype Pattern. Hence LEN(*V) causes an
execution-time error since the argument of LEN() cannot be
a Pattern.

In the Bell version when the name operator is applied
to a natural variable it returns an object of datatype
String, but when applied to a created variable it returns an
object of datatype Name. In the Berkeley version, the name
operator always returns an object of datatype Name.

In the Bell version the multiplication operator has 
higher precedence than the division operator; in the 
Berkeley version the precedence is the same. 
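
The practical effect of this difference shows up in an expression such as A / B * C. The Python sketch below contrasts the two groupings for non-negative integers; the function names are hypothetical, and left-to-right grouping of operators of equal precedence in the Berkeley version is an assumption, since it is not stated here.

```python
def berkeley_value(a: int, b: int, c: int) -> int:
    """a / b * c with equal precedence, ASSUMED to group left to right."""
    return (a // b) * c


def bell_value(a: int, b: int, c: int) -> int:
    """a / b * c with multiplication binding more tightly than division."""
    return a // (b * c)
```

With a = 12, b = 2, c = 3 the first grouping gives 18 and the second gives 2, so a program moved between the two versions may need explicit parentheses.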

Keywords. There are no keywords in the Berkeley version
(and hence no keyword operator). Some of the Bell keywords
assume the form of procedures; these are listed in the table
below.

     Bell version        Berkeley version

     &ALPHABET           ALPHABET()
     &ANCHOR             ANCHOR()
     &FNCLEVEL           FNCLEVEL()
     &MAXLNGTH           MAXLNGTH()
     &STCOUNT            STCOUNT()
     &STLIMIT            STLIMIT()

These procedures are described in Appendix A, section II. 

Datatypes. In the Berkeley version, numeric strings are
of datatype Integer. Numeric strings may have an initial
sign and hence the single characters '+' and '-' in
isolation have the datatype Integer and have the value zero
when used in arithmetic contexts. Correspondingly, the null
value is of datatype Integer. In the Bell version, the null
value is called the null string and is of datatype String.
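
The Berkeley rule for numeric strings (an optional sign followed by an optional string of digits, with a bare sign or the null value counting as zero) can be modelled as follows. This is a Python sketch of the rule as described, not of the interpreter; the name integer_value is hypothetical, and returning None for a non-numeric string stands in for the execution-time error.

```python
import re
from typing import Optional

# Optional sign followed by an optional run of digits.
_NUMERIC_STRING = re.compile(r"[+-]?[0-9]*")


def integer_value(s: str) -> Optional[int]:
    """Return the integer denoted by s under the rule above, or None.

    The null string, '+' and '-' all denote zero.
    """
    if _NUMERIC_STRING.fullmatch(s) is None:
        return None
    digits = s.lstrip("+-")
    return int(s) if digits else 0
```

Under this sketch '-42' denotes -42, while '', '+' and '-' all denote 0, and '12A' is not numeric.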

System Transfers. In the Berkeley version, RETURN,
FRETURN, NRETURN, and END are treated as system transfers,
having the same predefined meanings as in Bell. They may be
used as any other labels in the program text, however, in
which case the special system meaning is lost.

Output. Objects of datatype other than String or
Integer cannot be printed in the Berkeley version, and an
attempt to print such a value results in an execution-time
error. In the Bell version an attempt to print such a value
results in the printing of a string designating the datatype
of the value.

Assigning the variable OUTPUT a value of more than 132
characters in the Berkeley version results in only the first
132 being printed (a single line); in the Bell version, as
many lines as necessary are printed.






Program Representation. There are a number of small
differences in the way that programs may be represented;
most consist of extra optional features which have been
added to the Berkeley version.

In the Berkeley version, the assignment sign (=) need
not be bounded by blanks; similarly, the colon introducing a
go-to need not be preceded by a blank.

In the Berkeley version, the quote sign used as a
literal delimiter may appear within that literal in pairs;
each pair is then treated as representing a single quote.
Thus 'DON''T' may be used to represent the string DON'T.
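
The doubling convention can be made concrete with a small decoder. The Python sketch below is an illustration only; decode_literal is a hypothetical name, and the handling of malformed literals (raising ValueError) is an assumption of this sketch rather than documented compiler behavior.

```python
def decode_literal(text: str) -> str:
    """Decode a quoted literal in which the delimiter may appear doubled.

    Accepts either ' or " as the delimiter, as in the two string
    formats of Appendix J.
    """
    if len(text) < 2 or text[0] not in "'\"" or text[-1] != text[0]:
        raise ValueError("not a quoted literal")
    quote = text[0]
    body = text[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == quote:
            # Inside the literal the delimiter must come in pairs;
            # each pair stands for one quote character.
            if i + 1 >= len(body) or body[i + 1] != quote:
                raise ValueError("unpaired quote inside literal")
            i += 2
        else:
            i += 1
        out.append(ch)
    return "".join(out)
```

Applied to the literal 'DON''T' this yields the five-character string DON'T.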

In the Berkeley version, statements continued over line
boundaries may be broken anywhere; a blank is never assumed
at the point of the break. In the Bell version, statements
may be broken only where a blank is required.

In the Berkeley version, real literals need not begin
with digits (that is, they may begin with an initial decimal
point).

In the Berkeley version it is not necessary to
terminate a program text with a statement labelled END, as
it is in the Bell version. The program may terminate by
taking a transfer to END, if no END label is present. END
may be used as a label in a program text, in which case it
then loses its system significance, and a program containing
an END label can terminate only by running out of program
text; this is not an error as it is in Bell (see Chapter ?).
In the Berkeley version it is not possible to specify by use
of an END statement which statement of the program is to be
executed first; execution always begins with the first
statement of the program text.

Alternative characters may be used in the Berkeley
version to represent some of those which must otherwise be
multiple punched on an 026 keypunch. Thus the go-to may be
introduced by either a colon (:) or a slash (/). (If the
slash is used it must not be followed by any blanks as it
might then be indistinguishable from the binary division
operator.) The colon used as a delimiter between the upper
and lower bounds of an index in forming the prototype of an
array may also be represented by a slash. The alternation
operator (|) may be represented by two slashes (//) and the
square brackets of an item reference may be represented by
(/ for an open bracket and /) for a close bracket. The Bell
version does not provide any of these particular options
but has a different extended syntax to take advantage of
special characters available on the IBM 360; lower case
letters are also available.

The representation of labels is freer in the Bell 
version than in the Berkeley version. In the Bell version a 
label may consist of a letter or a digit followed by any 
number of other characters from the entire character set 
except blank. In the Berkeley version a label must be an
identifier; that is, it must begin with a letter and consist 
of nothing but letters, numbers, and periods. 
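
The two label rules can be compared side by side as patterns. The Python below is a sketch of the rules exactly as worded above; the names are hypothetical, and treating "letter" as upper-case A to Z only is an assumption made for the illustration.

```python
import re

# Berkeley: an identifier, i.e. a letter followed by letters, digits, periods.
BERKELEY_LABEL = re.compile(r"[A-Z][A-Z0-9.]*")
# Bell: a letter or digit followed by any characters except blank.
BELL_LABEL = re.compile(r"[A-Z0-9][^ ]*")


def is_berkeley_label(s: str) -> bool:
    return BERKELEY_LABEL.fullmatch(s) is not None


def is_bell_label(s: str) -> bool:
    return BELL_LABEL.fullmatch(s) is not None
```

A label such as LOOP.1 is acceptable under both rules, whereas 1ST or A-B would be acceptable only under the Bell rule.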

The Program Listing. In the Berkeley version, columns
72 and 73 of the program text are separated by ten spaces in
the output listing. The statement numbers always appear to
the left of the statements. In the Bell version the
statement numbers normally appear to the right of the
statements, but it is possible to specify that they appear
to either the left or the right. This is done by writing the
terms LEFT or RIGHT following the listing directive LIST;
the default option is RIGHT. There is no way to specify that
the statements should be numbered to the right in the
Berkeley version.

In the Berkeley version the listing directive SPACE has 
been added to cause one blank line to appear in the listing. 



II. Features Absent from the Berkeley Version

Procedures. The following procedures are available in 
the Bell version but not in the Berkeley version. Unless 
otherwise indicated, their actions cannot be simulated. 

ARG() returns the name of the n-th argument in the 
declaration of a programmer-defined procedure. 

BACKSPACE() backspaces a file one logical record.

CLEAR () causes all natural variables to be assigned the 
null value. This procedure can be written in Berkeley Snobol 
using NEXTVAR () . 

COLLECT() forces a storage regeneration. (Not needed
since storage regeneration occurs automatically.) 

COPY() produces a copy of an array or a data structure. 
It can be written in Berkeley Snobol using ITEM() for arrays
(see Chapter 7), and APPLY () for data structures. 






DUMP() produces an unalphabetized list of all non-null
natural variables and their values. It can be written in
Berkeley Snobol using NEXTVAR().

DUPL() returns a string consisting of n duplications of
one of its arguments. It is virtually the same as the
programmer-defined procedure REPEAT() given in Chapter 6.

EVAL() returns the result of evaluating a string which
is a Snobol expression or an object of datatype Expression. 

FIELD () returns the name of the n-th field in the 
declaration of a programmer-defined datatype. It can be 
written in Berkeley Snobol, because the Berkeley PROTOTYPE {) 
procedure may be applied to structures (see Appendix A, 
Section II. B). 

INTEGER() succeeds if its argument is an integer. It
can be easily written as

     IDENT(DATATYPE(ARG),'INTEGER')

(In the same way, any other test procedure for testing
datatypes may be written.)

LOAD() causes an external function to be loaded from
the library during execution.

LOCAL() returns the name of the n-th local (internal)
variable of a programmer-defined procedure.

OPSYN() allows the programmer to specify synonyms for
procedures or operators. Thus the same procedure may be
referred to by more than one name and the same operator by 
more than one symbol. In addition, operators and procedures 
may be made synonymous; thus this procedure makes possible 
the definition of new operators. 

REMDR() returns the integer remainder of dividing its
first argument by its second. This can be written in Snobol 
as a programmer-defined procedure employing nothing but 
arithmetic operators. 
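
As a sketch of how such a procedure might use nothing but arithmetic, the remainder can be formed as the dividend minus the product of the integer quotient and the divisor. The Python below illustrates that formula; the name remdr is hypothetical, and truncation of the integer quotient toward zero is an assumption, since the rounding rule of Snobol integer division is not stated here.

```python
def remdr(a: int, b: int) -> int:
    """Remainder of a divided by b, computed as a - (a / b) * b.

    ASSUMPTION: the integer quotient is truncated toward zero, so the
    remainder takes the sign of the dividend.
    """
    if b == 0:
        raise ZeroDivisionError("division by zero was attempted")
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return a - quotient * b
```

For example, remdr(17, 5) is 2, and under the truncation assumption remdr(-17, 5) is -2.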

REPLACE() returns a string in which every character of
one argument has been replaced by a corresponding character
of another argument. It can be written as a programmer-
defined procedure in Snobol.
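
The character-for-character substitution performed by such a procedure can be sketched as follows. This is a Python model, not Snobol code; the argument order (subject string, characters to be replaced, replacement characters) and the requirement that the two character lists have equal length are assumptions of the sketch.

```python
def replace(subject: str, source_chars: str, target_chars: str) -> str:
    """Replace each character of subject found in source_chars by the
    character at the same position in target_chars.

    Argument order and the equal-length requirement are assumptions.
    """
    if len(source_chars) != len(target_chars):
        raise ValueError("character lists must have the same length")
    return subject.translate(str.maketrans(source_chars, target_chars))
```

With this sketch, replace('HELLO', 'LE', '01') gives H100O: each L becomes 0 and each E becomes 1, while the other characters are unchanged.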

STOPTR() cancels the tracing of the variable named by
its argument. 






TABLE () creates a family of variables, similar to a 
one-dimensional array except that individual variables may 
be selected in terms of any data object, not just integers.
This datatype is not defined in the Berkeley version, but
table-like structures can be formed using indirect
referencing if the selector is a string.

TRACE() initiates tracing of the variable named by its
argument.

UNLOAD() causes the unloading of an external library
function which is no longer needed.

VALUE() has the same effect as the indirect referencing
operator when applied to a String or a Name, but if VALUE
has been defined to be a field of a structure, then it may 
have an argument of that datatype as well. 

Operators. The following operators are not available:

     negation (¬)
     cursor position (@)
     exponentiation (**)

The negation operator fails if its operand succeeds,
and succeeds if its operand fails. (Its counterpart, the
interrogation operator (?), which always succeeds, has been
implemented as the IF() procedure.)



The cursor position operator has a variable as its 
operand and is used within the pattern part of a rule. The 
variable is assigned, by immediate assignment, an integer 
representing the position of the cursor when pattern 
matching occurs. Thus 

     'ABC'  'B' @POINTER

causes POINTER to be assigned successively the values 0
and 1.

Keywords. The Berkeley version of Snobol contains no
keywords. Some keywords have been implemented as predefined
procedures, as indicated in Section I of this appendix; the
remaining keywords, listed below, cannot be simulated,
although sometimes a similar effect may be achieved through
other means. Those whose values are protected (i.e., cannot
be changed directly by the programmer) are marked with an
asterisk.



&ABEND is used to specify whether or not a system core
dump is to be printed at program termination.

&ABORT has the same value as that of the predefined
pattern ABORT. (*)

&ARB has the same value as that of the predefined
pattern ARB. (*)

&BAL has the same value as that of the predefined
pattern BAL. (*)

&CODE can be assigned an integer which will be returned
to the operating system as the user completion code at
program termination.

&DUMP is used to specify whether or not a dump of the
natural variables is to be printed at program termination.

&ERRLIMIT has a value which controls the handling of
certain program errors.

&ERRTYPE acquires an integer code identifying the type
of any program error which may occur. (*)

&FAIL has the same value as that of the predefined
pattern FAIL. (*)

&FENCE has the same value as that of the predefined
pattern FENCE. (*)

&FTRACE is used to specify whether or not diagnostic
tracing information is to be provided on calls to and
returns from all programmer-defined procedures.

&FULLSCAN is used to specify whether or not the
fullscan mode of pattern matching (in which no heuristics
are employed) is to be used.

&INPUT is used to specify whether or not any input is
to occur.

&LASTNO acquires as its value an integer specifying the
statement number of the previous statement executed. (*)

&OUTPUT is used to specify whether or not any output is
to occur.

&REM has the same value as that of the predefined
pattern REM. (*)

&RTNTYPE acquires as value the string RETURN, FRETURN,
or NRETURN, depending on the type of return made by the last
programmer-defined procedure which returned. (*)

&STFCOUNT acquires as value an integer specifying how
many statements have failed. (*)

&STNO acquires as value an integer specifying the
statement number of the statement currently being executed.

&SUCCEED has the same value as that of the predefined
pattern SUCCEED. (*)

&TRACE is used to specify whether or not tracing is to
occur.

&TRIM is used to specify whether or not all trailing
blanks are to be trimmed on input.

Pattern Variables. The predefined pattern variable
SUCCEED, which always matches the null value (and which has
very limited practical application) is not available.

Datatypes. The following datatypes do not exist in the
Berkeley version: 

Table (see the description of the TABLE () procedure 
above) 

Expression (see the description of deferred
evaluation in section I of this appendix)

External, which refers to external library functions
(see the description of the LOAD() and UNLOAD() procedures
above).

Pattern Matching. There is no quickscan mode of
pattern-matching (a mode which makes use of heuristics).
This is the normal mode in the Bell version, while fullscan
is the normal mode in the Berkeley version.

Arithmetic. Mixed mode arithmetic or comparisons
(involving integers and real numbers) are not permitted.

Output. The variable PUNCH has a predefined association
with the punch file in the Bell version; this is not true of
the Berkeley version, but the association can be made by
simply executing the rule

     OUTPUT('PUNCH','PUNCH')

The Berkeley version currently provides no compile-time
error messages and no program statistics. As is indicated by
the foregoing, it also provides no tracing facilities and no
dump.

III. Features not Present in the Bell Version

Procedures. The following predefined procedures have
been added to the Berkeley version; all are described more
fully in Appendix A.

CLOCK() returns the 24-hour time of day (e.g.,
ITiOO:*^"?). (See Appendix A, section IT.B.) 

TYPE() returns the same result as DATATYPE() for
objects of predefined datatypes, and the string DATA for all
objects of programmer-defined datatypes. (See Appendix A,
section II.B.)

ITEM() has been made more flexible and more useful in
the Berkeley version than it is in the Bell version. It is
described in detail in Chapter 7.

PROTOTYPE() has been significantly extended so that it
may be applied to structures, Patterns, and Names, as well
as to Arrays. (See Appendix A, section II.B.)

A number of field selection procedures have been added
for use in conjunction with the systems-defined "prototypes"
of Patterns and Names which are returned by the PROTOTYPE()
procedure. The procedures PARAM(), FIRST(), REST(), LEFT(),
and RIGHT() may be used to decompose Patterns into the
objects from which they were constructed. A similar service
for Names is provided by the procedures RIGHT(), FAMILY(),
and SELECTOR(). (See Appendix A, section I.C.)

NEXTVAR() returns the names of all members of any
family cyclically, treating the set of all non-null natural
variables as a "family." (See Appendix A, section II.B.)






INDEX 



ABORT, 151 
Addition, 19 
ALPHABEtO, 140 



Alternation, 35 

ANCHOR {) , 43, 145 

Anchored pattern 

matching, 43, 46 

ANYO , 36, 128 

APPLY , 92, 144 

ARB, 52, 150 

ARBNOO , 46, 130 

Arithmetic operators , 153 
addition, 19 
division, 19 

mux txpxxoa Lxuii , x^ 

negative , 8 
positive, 8 
subtraction, 19 

ARRAY {) , 104, 130 

Array 

creation, 100 
dimension, 103 
index, 105 
item reference, 101, 

106 
prototype, 110 

Assignment 

assignment rule, 10 
conditional assignment, 

38 
immediate assignment, 

40 

Assignment rule, 10 



BAL, 150 
Binary operators, 16, 153 
    addition, 19 
    alternation, 35 
    concatenation, 17 
    conditional assignment, 38 
    division, 19 
    immediate assignment, 40 
    multiplication, 19 
    subtraction, 19 
BREAK(), 41, 128 



Carriage control, 146 
Character set representation, 158 
CLOCK(), 140 
CODE(), 145 
Comment card, 156 
Compilation 
    during execution, 145 
    of program text, 6 
Compiler, 6 
Compile-time error messages, 166 
Concatenation, 17 
    with indirect referencing, 60 
    with null value, 29 
    within patterns, 39 
Conditional assignment, 38 
Conditional go-to, 23 
Continuation card, 155 






CONVERT(), 145 
Created variable, 101 
    array item, 101 
    name of, 116 
    structure field, 135 



DATA(), 135 
DATATYPE(), 136 
Datatypes, 126 
    array, 100 
    code, 145 
    integer, 8 
    name, 116 
    pattern, 49 
    programmer-defined, 135 
    real, 19 
    string, 8 
DATE(), 140 
Declarations, 135 
    DATA(), 135 
    DEFINE(), 135 
Deferred evaluation, 50 
DEFINE(), 72, 135 
DETACH(), 147 
DIFFER(), 26, 127 
Division, 19 



-EJECT, 156 
END, 23 
ENDGROUP(), 147 
EOI(), 148 
EORLEVEL(), 148 
Entry label, 73 
EQ(), 28, 127 
Error messages 
    compile-time, 166 
    execution-time, 167 
Evaluation rule, 25 
Execution of programs, 6 
Execution-time error messages, 167 
Extended syntax, 156 
External variable, 80, 90 

FAIL, 150 
Failure 
    in pattern matching, 33 
    of input, 24 
    of item reference, 106 
    of procedure call, 26, 75 
    of the rule, 24 
FAMILY(), 133 
Family, 100, 138, 141 
FENCE, 151 
Field, 135 
Field selection procedure, 135 
FIRST(), 131 
Flow of control, 21 
FNCLEVEL(), 141 






Formal variable, 72 
FREEZE(), 148 
FRETURN, 75 

GE(), 28, 127 
Go-to 
    conditional, 23 
    unconditional, 22 
    with indirect referencing, 67 
GT(), 28, 127 

IDENT(), 26, 127 
Identifier form, 9 
IF(), 144 
Immediate assignment, 40 
Indirect referencing, 55 
Infinite loop. See Loop, infinite 
INPUT, 13 
    failure of, 24 
INPUT(), 146 
Input/output procedures, 146 
Integer, 8 
Integer literal, 9 
Internal variable, 72, 76, 78 
Interpreter, 6 
ITEM(), 108, 143 
Item, 101 
Item reference, 101 



Label, 21 
LE(), 28, 127 
LEFT(), 132 
LEN(), 42, 129 
LGT(), 27, 127 
-LIST, 156 
Listing control card, 156 
Loop, 29 
    infinite. See Infinite loop 
LT(), 28, 127 

MAXLNGTH(), 141 
Multiplication, 19 

Name 
    of created variable, 101, 116 
    of natural variable, 9, 56, 101, 116 
Name operator, 116 
NE(), 28, 127 
Negative, 8 
NEXTVAR(), 141 






NOTANY(), 36, 128 
NRETURN, 75, 90, 118 
Null value, 11 
Numeric string, 8 

Omitted argument, 77, 126 
Operators, 16 
    summary of, 153 
OUTPUT, 12 
OUTPUT(), 145 

PARAM(), 131 
Passing of arguments, 77 
Pattern matching, 33 
Pattern-matching rule, 33 
POS(), 46, 129 
Positive, 8 
Precedence, 153 
Predefined pattern variables, 52, 150 

Predefined procedures 
    summary of, 123 
    ALPHABET(), 140 
    ANCHOR(), 43, 145 
    ANY(), 36, 128 
    APPLY(), 92, 144 
    ARBNO(), 46, 130 
    ARRAY(), 104, 130 
    BREAK(), 41, 128 
    CLOCK(), 140 
    CODE(), 145 
    CONVERT(), 145 
    DATA(), 135 
    DATATYPE(), 136 
    DATE(), 140 
    DEFINE(), 72, 135 
    DETACH(), 147 
    DIFFER(), 26, 127 
    ENDGROUP(), 147 
    EOI(), 148 
    EORLEVEL(), 148 
    EQ(), 28, 127 
    FAMILY(), 133 
    FIRST(), 131 
    FNCLEVEL(), 141 
    FREEZE(), 148 
    GE(), 28, 127 
    GT(), 28, 127 
    IDENT(), 26, 127 
    IF(), 144 
    INPUT(), 146 
    ITEM(), 108, 143 
    LE(), 28, 127 
    LEFT(), 132 
    LEN(), 42, 129 
    LGT(), 27, 127 
    LT(), 28, 127 
    MAXLNGTH(), 141 
    NE(), 28, 127 
    NEXTVAR(), 141 
    NOTANY(), 36, 128 
    OUTPUT(), 146 
    PARAM(), 131 
    POS(), 46, 129 
    PROTOTYPE(), 110, 137 
    REMARK(), 147 
    REST(), 131 
    REWIND(), 147 
    RIGHT(), 132 
    RPOS(), 46, 130 
    RTAB(), 44, 129 
    SELECTOR(), 134 
    SIZE(), 16, 136 
    SPAN(), 41, 128 
    STCOUNT(), 140 
    STLIMIT(), 141 
    TAB(), 44, 129 
    TIME(), 140 
    TRIM(), 15, 130 
    TYPE(), 111, 136 






Procedure call, 14, 76 
    argument of, 77 
    failure of, 26, 75 
    level of, 87 
    recursive, 74 
    side effect of, 84 
    summary of execution of, 154 
Procedure definition, 70 
    DEFINE(), 72 
    entry label, 73 
    formal variable, 72 
    internal variable, 72, 76, 78 
    procedure body, 74 
    procedure name, 72 
    result variable, 75 
Procedure reference, 14 
Procedures, 14, 70 
    predefined, summary of, 123 
    programmer-defined, 70 
Program execution, 6 
Program text representation, 155 
Programmer-defined datatypes, 135 

Programmer-defined procedures, 70 
    DEFINE(), 72 
    entry label, 73 
    external variable, 80, 90 
    formal variable, 72 
    FRETURN, 75 
    internal variable, 72, 76, 78 
    NRETURN, 75, 90, 118 
    procedure body, 74 
    procedure name, 72 
    recursive, 74 
    result variable, 75 
    RETURN, 75 
    returning a variable, 90 
    side-effect, 84 
    summary of execution of, 154 
PROTOTYPE(), 110, 137 
Prototype 
    of array, 110 
    of name, 139 
    of pattern, 138 
    of structure, 137 
    predefined, 138 



Quotation marks, 157 

Real literal, 145 
Real number, 19 
Recursive procedure call, 74 
REM, 52, 150 
REMARK(), 147 
Replacement rule, 34 
REST(), 131 
Result variable, 75 
RETURN, 75 
REWIND(), 147 
RIGHT(), 132 
RPOS(), 46, 130 
RTAB(), 44, 129 






Rule 
    assignment, 10 
    evaluation, 25 
    pattern-matching, 33 
    replacement, 34 

SELECTOR(), 134 
Selector, 106 
SIZE(), 16, 136 
-SPACE, 156 
SPAN(), 41, 128 
Statement terminator, 155 
STCOUNT(), 140 
STLIMIT(), 141 
String, 8 
String literal, 8 
String reference, 33 
Subtraction, 19 
Syntax 
    extended, 156 
    of program texts, 161 
System transfers 
    END, 23 
    FRETURN, 75 
    NRETURN, 75, 90, 118 
    RETURN, 75 



TAB(), 44, 129 
Test procedures, 127 
    predefined, 26 
    programmer-defined, 81 
TIME(), 140 
TRIM(), 15, 130 
TYPE(), 111, 136 

Unanchored pattern matching, 44, 145 
Unary operators, 16, 153 
    deferred evaluation, 50 
    indirect referencing, 55 
    name, 116 
    negative, 8 
    positive, 8 
-UNLIST, 156 

Variable, 9 
    created, 101, 116 
    external, 80, 90 
    internal, 72, 76, 78 
    natural, 9, 56, 101, 116 



